ABSTRACT


In  this  thesis,  we  will  investigate  the  adaptive  stochastic 
control  of  linear  dynamic  systems  with  purely  random  parame- 
ters. Hence  there  is  no  posterior  learning  about  the  system 
parameters.  The  control  law  is  non-dual;  still  it  has  the 
qualitative  properties  of  an  adaptive  control  law.  In  the 
perfect  measurement  case,  the  control  law  is  modulated  by  the 
a priori  level  of  uncertainty  of  the  system  parameters.  The 
Certainty-Equivalence  Principle  does  not  hold. 

This  thesis  shows  that  the  optimal  stochastic  control  of 
dynamic  systems  with  uncertain  parameters  has  certain  limi- 
tations. For  the  linear-quadratic  optimal  control  problem, 
it  is  shown  that  the  infinite  horizon  solution  does  not  exist 
if  the  parameter  uncertainty  exceeds  a certain  quantifiable 
threshold.  By  considering  the  discounted  cost  problem,  we 
have  obtained  some  results  on  optimality  versus  stability 
for  this  class  of  stochastic  control  problems. 

For  the  noisy  sensor  measurement  case,  we  obtained  the  opti- 
mal fixed  structure  estimator-controller.  The  control  law 
requires  the  solution  of  a coupled  nonlinear  two-point 
boundary  value  problem.  Computer  simulations  of  the  forward 
and  backward  difference  equations  provided  some  insight  into 
the  uncertainty  threshold  for  the  closed-loop  system.  Sto- 
chastic stability  analysis  further  resulted  in  a sufficient 
condition  for  the  mean  square  stability  of  the  fixed  structure 
dynamic  compensator. 
CHAPTER 1
INTRODUCTION 


1.1  A Historical  Survey  of  Adaptive  Stochastic  Control 

The theory of optimal closed-loop control of stochastic
linear dynamic systems has progressed since the original
contributions in [1], [2]. For discrete-time linear dynamic
systems with known system parameters and known additive Gaussian
noise statistics with quadratic cost, the optimum solution to
the stochastic control problem is given by the Separation
Theorem [3], [4]. These stochastic control-theoretic results
have been reconciled with the statistical decision-theoretic
results given by the Certainty-Equivalence Principle for
multistage decision processes [5], [6].

For linear dynamic systems with uncertain parameters
or unknown noise statistics, there does not exist at present
a general computationally feasible theory of optimum stochastic
control. Bellman first presented a mathematical theory of
adaptive control processes in [7]. He introduced the concepts
of "information pattern" and of a control device that can "learn".
Feldbaum expanded on the concept and algorithms of adaptive
control in his four-part theory of dual control [8], so called
because the optimum controller must actively try to identify
the unknown parameters while simultaneously controlling the
system. He showed that in dual control systems there may
exist an inherent conflict between applying the inputs for
learning and for effective control purposes. The dual control law
must then reflect the optimum interaction of caution and
probing in the closed-loop control system. Feldbaum then
distinguished between two kinds of loss, one due to the
deviation of the state and the other due to the nonoptimal
learning control law [9].

The concepts of separation, certainty-equivalence,
neutrality, and related dual control effects have been further
clarified and discussed in [10]-[16]. The present dual
control action may influence future learning. In the so-called
neutral control systems described in [17], [18], learning
is independent of the control law. The neutral control
law accounts for present uncertainty, but neglects the
possibility that the present control action may influence future
uncertainty, thus resulting in a one-way separation.

Optimal solutions to the adaptive stochastic control
of a class of linear dynamic systems with constant or
time-varying unknown parameters can be obtained, in principle,
using the stochastic dynamic programming method. The
optimization algorithm is constructive and the solution is
obtained by solving a recursive functional equation involving
alternating minimizations and expectations [8]. However,
due to the "curse of dimensionality" the solution in general
cannot be obtained analytically in closed form. The dynamic
programming algorithm encounters the problem of infinite
dimensionality of the probability distribution function in
the general case.

Since we cannot solve the adaptive control problem
analytically except for very special cases [19], [20], in
practice we resort to approximation methods. The degradation
in performance of the suboptimal adaptive control law can be
measured by comparing the average performance of the proposed
suboptimal control algorithm obtained from Monte Carlo
simulations with the optimal but unattainable performance for the
same control system in which the parameters are known with
certainty.

There are two approaches to the approximation of the
optimal adaptive control law. First, we may approximate the
optimal solution to the adaptive stochastic control problem.
This approach is taken in [7], [8], [11], [21]-[23]. The
second approach is to approximate the linear system as one with
random parameters and derive the optimal adaptive stochastic
control for the approximate control system. This can be done
by relaxing certain mathematical assumptions and the information
structure of the optimal adaptive control law. In doing so,
we may be able to obtain the suboptimal control law
analytically. One such method is enforced separation, as in [24].
Another is the open-loop feedback technique [25]-[30].
Literature surveys and reviews of the state of the
art of adaptive control concepts and methods are found in
[31]-[33]. An extensive bibliography on the theory and
application of the various suboptimal adaptive estimation
and control techniques is given in [34].

In this thesis, we will investigate a class of
stochastic optimal control problems with purely random (white)
parameters whose mathematical solutions reflect some of the
aspects of adaptive stochastic control laws (Fig. 1.1).

The use of multiplicative white noise parameters explicitly
tells the mathematics that the system dynamics are not known
exactly and can vary in an unpredictable way. This is an
important class of problems because it represents a worst-case
design and analysis. The results provide some insights
and help to evaluate whether the use of very sophisticated
identification and control algorithms may represent an
"overkill".

Optimum control of linear systems with statistically
independent random parameters is considered in [35]. For a
constant linear system with multiplicative input noise, the
effect of the random parameters was found to show the
convergence of the feedback coefficients [2]. Necessary and
sufficient conditions for a class of stationary linear systems
with random parameters to be controllable in the mean-square sense
were examined in [36]. Solutions to the optimal stochastic
control problem with independent random parameters have been
derived in [37], [38], and [39].

The mathematical formulation of the stochastic control
problem with uncertain parameters forces the solution to
be without any learning. In particular, we consider the
linear dynamical system

$$ x(t+1) = A(t)\,x(t) + B(t)\,u(t) + \xi(t), \qquad t = 0, 1, 2, \ldots, N-1 \tag{1.1.1} $$


For simplicity we shall assume that the measurement is exact.
The structure of the matrices A(t) and B(t) is known, but their
elements contain uncertain parameters. $\xi(t)$ is the plant white
noise (disturbance). The cost functional to be minimized is
given by the scalar

$$ J = E\left\{ x'(N)\,F\,x(N) + \sum_{t=0}^{N-1} \left[ x'(t)\,Q(t)\,x(t) + u'(t)\,R(t)\,u(t) \right] \right\} \tag{1.1.2} $$

where F, Q(·), and R(·) are at least positive semi-definite.

The uncertain parameters in A(·) and B(·) change
randomly with time. At each instant of time, "nature" selects
the value of the system parameters from some a priori given
distribution. The way "nature" selects the particular numerical
value of the system parameters at each instant of time
represents a chance event in time. That is, the time-varying
parameters represent a white process. Hence, the mathematics
tells the compensator that it cannot use the measurement data
to improve on the a priori mean or to reduce the level of
uncertainty of the parameters below the a priori variance. The optimal
solution cannot involve any learning about the system parameters.

Although  the  mathematical  formulation  of  the  problem 
precludes  identification,  the  solution  of  the  optimal  stochastic 
control  problem  in  the  sense  of  minimizing  a cost  functional 
shows  the  effects  of  parameter  uncertainty  in  the  performance 
of the control system. The control gain of an optimal stochastic
system with randomly varying parameters will depend upon
the unconditional means and covariances of the uncertain
parameters. The Separation Theorem does not hold. Randomness
in the system parameters has a strong influence on the gain
of the control system, even in the absence of any learning.

The  minimum  value  of  the  expected  quadratic  cost 
depends  not  only  upon  the  means  but  also  upon  the  variance  of 
the  randomly  varying  parameters.  In  the  worst  case  sense,  one 
has  then  an  upper  bound  upon  the  performance  deterioration  of 
the  control  system  due  to  uncertain  parameters.  The  difference 
between  this  worst  case  cost  and  the  Separation  Theorem  cost 
is  the  so-called  value  of  model  information  for  stochastic 
adaptive  control  problems. 

This class of stochastic control problems is closely
related to the state-dependent and control-dependent noise
problem considered in continuous time for perfect measurement
[40] to [45] and in discrete time for noisy measurement
[46] to [49]. The specific class of stochastic models given
in Eq. (1.1.1) is also known as the multiplicative noise or
random coefficient (multiplier) models. In [20] it is shown
that if the only uncertain parameter in Eq. (1.1.1) is in the
matrix B, then the nonlinear stochastic control system is
essentially a bilinear system. Hence the results for this
class of adaptive control problems are readily applicable to
the class of stochastic bilinear systems.

1.2  Structure of the Thesis

In this thesis, we will obtain the results almost
entirely for scalar systems. In the very simple first-order
dynamical systems, we have no problem with system
controllability or observability. The optimized stochastic
control problem is well-posed and well-defined to give
existence and uniqueness results. The analytical results in
the subsequent chapters for the scalar linear-quadratic-Gaussian
systems must be true for multivariable-nonlinear-non-Gaussian
systems since the LQG problem is a special
case of the more general formulation. The extension of
these results to the multivariable case is conceptually
straightforward, although notationally cumbersome.

The optimal stochastic control problem with perfect
state measurement is considered in Chapter 2. The mathematical
formulation of the problem is given in Section 2.2. The
solution to the "white noise parameters" optimization is
obtained using the stochastic dynamic programming algorithm in
Section 2.3. The important features of the control solution
are discussed. In Section 2.4, we examine the steady-state
solution of the optimal stochastic control problem. In
particular, we derive the inequality condition for the existence of
a finite solution to the Riccati-like equation for the infinite
horizon problem. In Section 2.5, the stochastic optimization
problem is treated as a stochastic stability problem. We
give the necessary and sufficient conditions for the almost
sure and mean square stability of the stochastic system under
linear feedback. The concept of optimality versus stability
is further brought out in Section 2.6 when we consider the
discounted cost problem. In Section 2.7, we extend the results
of Section 2.3 to the case where the multiplicative noises are
correlated with the additive noise.

In Chapter 3, we treat the problem of optimum linear
minimum variance estimation for the random parameter system.
The estimation problem is stated in Section 3.1. The linear
minimum variance filter is derived in Section 3.2. It is
found that the parameter means and variances have to satisfy
a necessary and sufficient condition for the asymptotic
variance of the uncontrolled linear system to be finite (and this
turns out to be sufficient to ensure stochastic stability as
well). In Section 3.4, we discuss the case where the
uncertain parameters are uncorrelated. In Section 3.5, the
analysis is extended to include mutually correlated randomly
varying parameters.

In Chapter 4, we consider the closed-loop (feedback)
control of systems with randomly varying parameters and noisy
measurements. The mathematical problem is formulated in
Section 4.2. In Section 4.3 we examine the optimal solution
to the control problem using stochastic dynamic programming.
In Section 4.4, we fix the structure of the class of dynamic
compensators to be considered. We obtain the optimal
parameters (filter gains and control gains) first using the Matrix
Minimum Principle and then the dynamic programming algorithm. The
important point is that we transformed the original stochastic
control problem in Section 4.2 into a deterministic parameter
optimization problem in Section 4.4. Section 4.5 shows that
we have to solve a complex coupled nonlinear two-point boundary
value problem in order to compute the optimal gains. We discuss
the various aspects of the fixed structure estimator-controller
in Section 4.6. We consider the asymptotic behavior of the
derived stochastic control law in Section 4.7. Numerical
simulations of the stochastic equations provide the needed
insights into the existence of steady-state control laws.
Stochastic stability analysis analogous to that in Section 2.5
based on output feedback is given in Section 4.8. A sufficient
condition for the stochastic system to be mean-square
stabilizable under feedback is presented.

In  Chapter  5,  we  extend  the  results  in  Chapter  2 to 
a special  class  of  linear  multivariable  systems.  We  give  the 
mathematical  formulation  of  the  optimal  stochastic  control 
problem in Section 5.2. The solution via the dynamic programming
algorithm is given. In Section 5.3, we consider the optimal
stochastic  control  of  a multivariable  linear  system  with  a 
specific  structure  with  respect  to  a quadratic  performance 
index.  The  system  dynamics  are  described  by  a linear  vector 
difference  equation  with  white,  possibly  mutually  correlated, 
scalar  random  parameters.  In  Section  5.4  we  summarize  the 
results  on  the  adaptive  stochastic  control  of  linear  multi- 
variable  systems  with  imperfect  measurements. 

We summarize the results on the optimum stochastic
control of linear dynamic systems with purely random
parameters in Section 6.1. We draw conclusions about optimality
and stochastic stability in Section 6.2. We discuss the
existence, finiteness, and convergence of the derived
optimal control law. In Section 6.3, we recommend
directions for future research in this area.

1.3  Contributions of the Thesis

The optimal stochastic control results for the exact
state measurements problem have been known for some time
[37]. However, their potential importance and their
implications in adaptive control have not yet been fully
realized. This thesis reports on research into the optimal
stochastic control of white noise parameter systems. The
objective is to gain deeper insights and a clearer
understanding of the issues and philosophy of adaptive control.
Even in the absence of learning, the degree of dynamic
uncertainty (as quantified by the variances of the
multiplicative white noise parameters) influences both the optimal
control gains and the optimal value of the performance index.

In  this  thesis  research  we  shall  analyze  stochastic 
systems  with  white  parameters  as  a worst  case  to  provide  a 
systematic  analysis  and  design  approach  to  adaptive  stochas- 
tic control.  We  derive  the  upper  bound  on  the  average  cost 
for  the  exact  measurement  and  the  noise-corrupted  measure- 
ment cases.  We  analyze  the  dual  nature  of  stochastic  control 
for  systems  with  uncertain  parameters  in  a most  transparent 
mathematical  framework.  The  mathematical  formulation  pre- 
cludes any  learning  about  the  parameters,  however. 

We derive the necessary and sufficient condition
for the optimal control law for the perfect measurement case.
We then derive the necessary and sufficient condition for
stochastic stability in the almost sure and mean-square sense
for the class of stochastic systems under consideration. The
Uncertainty Threshold Principle then says that there exists a
threshold of dynamic uncertainty which, if exceeded, means that
optimal strategies cannot exist. We have derived the optimality
condition for the discounted cost problem. This problem
provides an interesting and important case study of the
optimality versus stability problem in stochastic control theory.
We were also able to extend the analysis on control to the
case where the multiplicative noises are correlated with
additive noises.
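
The threshold behavior can be illustrated numerically with the scalar Riccati-like recursion that Chapter 2 derives as Eq. (2.3.13). The following is a minimal sketch under stated assumptions: stationary parameters, zero cross-covariance, Q = R = 1, and two sets of variances chosen here purely for illustration. The function name `iterate_riccati` is hypothetical. The closed-form threshold itself is the subject of Section 2.4 and is not reproduced here.

```python
def iterate_riccati(steps, a_bar, b_bar, s_aa, s_bb, s_ab=0.0,
                    Q=1.0, R=1.0, F=0.0):
    """Apply the backward recursion of Eq. (2.3.13) `steps` times from K = F.

    All parameters are constant in time (a stationary system); this is an
    assumption of the sketch, not a restriction of the thesis.
    """
    K = F
    for _ in range(steps):
        denom = R + (s_bb + b_bar ** 2) * K
        gain = K * (s_ab + a_bar * b_bar) / denom
        K = (a_bar ** 2 + s_aa) * K + Q - gain ** 2 * denom
    return K


if __name__ == "__main__":
    # Illustrative values only: small versus large parameter variances.
    for label, s_aa, s_bb in [("small variance", 0.01, 0.01),
                              ("large variance", 0.5, 1.0)]:
        for steps in (10, 50, 200):
            K = iterate_riccati(steps, a_bar=1.1, b_bar=1.0,
                                s_aa=s_aa, s_bb=s_bb)
            print(f"{label:15s} horizon {steps:4d}  K = {K:.6g}")
```

With the small variances the iterates settle to a finite value as the horizon grows; with the large variances they keep growing, which is the numerical signature of a non-existent infinite horizon solution.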

In the deterministic linear-quadratic control problem
the duality principle holds, that is, the linear stochastic
estimation problem is related through duality to the optimal
deterministic control problem. The dual of the control
problem with the pair (C', B') is the estimation problem pair (B, C).
For linear discrete-time systems, the duality principle says that
the various matrices that occur in the optimal regulator
problem and the optimal state reconstruction problem are
related and have a symmetry property [50]. We show that
this duality property does not hold for the optimal
regulator and optimum linear minimum variance estimation problems
for the class of adaptive stochastic control problems. In
particular, the stability condition for the asymptotic
behavior of the optimum linear minimum variance filter problem
cannot be obtained by "dualizing" the stability condition
for the optimum regulator problem given in Section 2.4.
We have obtained the linear minimum variance
unbiased filter with deterministic control input. Results are
generalized  to  the  case  where  all  the  random  parameters  may 
be  correlated.  The  necessary  and  sufficient  condition  for 
the  asymptotic  stability  of  the  state  second  moment  turns  out 
to  be  only  a sufficient  condition  for  the  stochastic  stability 
of  the  fixed  structure  overall  closed-loop  system. 

For the noisy sensor measurement case, we derived
the fixed structure dynamic compensator using the dynamic
programming algorithm. We determined the average cost expression
(in a worst-case sense). The use of direct output feedback
is shown to give only a sufficient condition for the
mean-square stability of the overall control system.

CHAPTER 2

OPTIMAL  STOCHASTIC  CONTROL  FOR  THE 
PERFECT  MEASUREMENT  SYSTEM 


2.1  Introduction

In this chapter, the optimal control problem for
purely random parameters will be formulated and solved for
the perfect observation case. We present the mathematical
model of a class of stochastic linear systems in Section 2.2
and give the technical assumptions about the statistical laws
for the random processes. The optimal stochastic control
problem is then formulated assuming perfect measurements.
In Section 2.3, we give the solution to the optimal control
problem via dynamic programming. In Section 2.4, we examine
the stability properties of a stationary system. The
Uncertainty Threshold Principle is given in Theorem 2.1. We
examine the stochastic stability of a linear system under
linear feedback in Section 2.5. In Section 2.6, we discuss
the discounted cost problem and give a modified threshold
for the particular cost functional chosen. We discuss some
important new issues in stochastic controllability and
stability. In Section 2.7, we extend the results of Sections
2.2 and 2.4 to linear systems where the random parameters
and the additive noise are correlated.
2.2  Problem Statement

In this section, we will state the problem. Consider
a first-order stochastic linear dynamical system with
state x(t) and control u(t) described by the difference
equation

$$ x(t+1) = a(t)\,x(t) + b(t)\,u(t) + \xi(t), \qquad t = 0, 1, 2, \ldots, N-1 \tag{2.2.1} $$

with x(0) given.

We assume that the additive noise $\xi(t)$ driving the
system dynamics is a zero-mean Gaussian white noise with
known variance

$$ E\{\xi(t)\,\xi(\tau)\} = \Xi(t)\,\delta(t,\tau) \tag{2.2.2} $$

We assume that the purely random parameters a(t) and b(t)
are Gaussian and white (uncorrelated in time) with known means
$\bar{a}(t)$ and $\bar{b}(t)$, and covariances $\Sigma_{aa}(t)$ and
$\Sigma_{bb}(t)$, respectively, and cross-covariance given by
$\Sigma_{ab}(t)$. More precisely, we assume that

$$ E\{a(t)\} = \bar{a}(t), \qquad E\{b(t)\} = \bar{b}(t) \qquad \forall t \tag{2.2.3} $$

$$ E\{(a(t) - \bar{a}(t))(a(\tau) - \bar{a}(\tau))\} = \Sigma_{aa}(t)\,\delta(t,\tau) \tag{2.2.4} $$

$$ E\{(b(t) - \bar{b}(t))(b(\tau) - \bar{b}(\tau))\} = \Sigma_{bb}(t)\,\delta(t,\tau) \tag{2.2.5} $$

$$ E\{(a(t) - \bar{a}(t))(b(\tau) - \bar{b}(\tau))\} = \Sigma_{ab}(t)\,\delta(t,\tau) \tag{2.2.6} $$
where $\delta(t,\tau)$ is the Kronecker delta and

$$ \Sigma_{aa}(t)\,\Sigma_{bb}(t) - \Sigma_{ab}^2(t) \ge 0 \tag{2.2.7} $$

since the correlation coefficient satisfies $|\rho| \le 1$.
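
The statistical assumptions in Eqs. (2.2.2) to (2.2.7) can be made concrete with a short sampling sketch. It is only an illustration under stated assumptions: the function names (`is_valid_covariance`, `sample_parameters`, `step`) are hypothetical, the numerical values are arbitrary, and a 2x2 Cholesky factor is one convenient way (not necessarily the author's) to draw jointly Gaussian a(t) and b(t) with a prescribed cross-covariance.

```python
import math
import random


def is_valid_covariance(s_aa, s_bb, s_ab):
    """Check Eq. (2.2.7) together with non-negative variances."""
    return s_aa >= 0.0 and s_bb >= 0.0 and s_aa * s_bb - s_ab ** 2 >= 0.0


def sample_parameters(rng, a_bar, b_bar, s_aa, s_bb, s_ab):
    """Draw one independent (white) sample of the pair (a(t), b(t))."""
    if not is_valid_covariance(s_aa, s_bb, s_ab):
        raise ValueError("covariances violate Eq. (2.2.7)")
    z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    # Cholesky factor of [[s_aa, s_ab], [s_ab, s_bb]].
    l11 = math.sqrt(s_aa)
    l21 = s_ab / l11 if l11 > 0.0 else 0.0
    l22 = math.sqrt(max(s_bb - l21 ** 2, 0.0))
    return a_bar + l11 * z1, b_bar + l21 * z1 + l22 * z2


def step(rng, x, u, a_bar, b_bar, s_aa, s_bb, s_ab, xi_var):
    """One transition of Eq. (2.2.1) with freshly drawn a(t), b(t), xi(t)."""
    a, b = sample_parameters(rng, a_bar, b_bar, s_aa, s_bb, s_ab)
    xi = rng.gauss(0.0, math.sqrt(xi_var))
    return a * x + b * u + xi


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so that runs are repeatable
    x = 1.0
    for t in range(5):
        x = step(rng, x, u=0.0, a_bar=1.0, b_bar=0.5,
                 s_aa=0.04, s_bb=0.09, s_ab=0.03, xi_var=0.01)
        print(t + 1, x)
```

Because a fresh, independent pair is drawn at every step, past observations of x(t) carry no information about the next draw, which is the sense in which the formulation precludes learning.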

It is assumed that the additive white noise $\xi(t)$ is
statistically independent of the random parameters a(t) and
b(t). The case where a(t), b(t), and $\xi(t)$ are correlated is
discussed later in Section 2.7.

For the stochastic control problem it is very
important to specify the information available for control.
In this chapter, we assume that the state x(t) can be
measured exactly. Hence we assume that x(0) is given.

We assume that the admissible controls are real-valued
and of state feedback type u(t) = y(x(t),t). The
control can only depend on the given a priori information
and measurements up to time t. The control u(t) at time t
can only influence the state $x(\tau)$ at $\tau \ge t+1$ and not before.
This is the important notion of causal inputs: past and
present output values do not depend on future input values.

The optimal control problem is to determine the
control law u(t) = y(x(t),t), t = 0, 1, ..., N-1, such that the
expected value of a quadratic cost functional is minimized.
The quadratic cost functional is of the standard regulator type,

$$ J(0) = E_{a(\cdot),\,b(\cdot),\,\xi(\cdot)} \left\{ F x^2(N) + \sum_{t=0}^{N-1} \left[ x^2(t)\,Q(t) + u^2(t)\,R(t) \right] \right\}, \qquad F, Q \ge 0, \quad R > 0 \tag{2.2.8} $$
The expectation is taken with respect to the probability
distribution of the underlying random variables a(t), b(t),
and $\xi(t)$.

Based upon the application of Bellman's Principle
of Optimality and functional equations, dynamic programming
is used to solve the optimal control problem formulated in
Eqs. (2.2.1) and (2.2.8).


2.3  Problem Solution

The solution to the optimal control problem given
in Eqs. (2.2.1) and (2.2.8) can be obtained by applying the
standard dynamic programming method. The cost-to-go at the
final time is given by

$$ V(x(N), N) = F x^2(N) \tag{2.3.1} $$

By the Principle of Optimality,

$$
\begin{aligned}
V(x(N-1), N-1) &= \min_{u(N-1)} E_{a(N-1),\,b(N-1),\,\xi(N-1)} \left\{ Q(N-1)\,x^2(N-1) + R(N-1)\,u^2(N-1) + V(x(N), N) \mid x(N-1) \right\} \\
&= \min_{u(N-1)} \Big\{ \left[ Q(N-1) + F\left( \bar{a}^2(N-1) + \Sigma_{aa}(N-1) \right) \right] x^2(N-1) \\
&\qquad + \left[ R(N-1) + F\left( \bar{b}^2(N-1) + \Sigma_{bb}(N-1) \right) \right] u^2(N-1) \\
&\qquad + 2F\left( \bar{a}(N-1)\,\bar{b}(N-1) + \Sigma_{ab}(N-1) \right) x(N-1)\,u(N-1) \Big\} + F\,\Xi(N-1)
\end{aligned}
\tag{2.3.2}
$$

since $\xi(N-1)$ is independent of u(N-1), x(N-1), and the
random parameters a(N-1) and b(N-1).

We minimize the algebraic expression in Eq. (2.3.2)
by taking the derivative with respect to u(N-1) and setting
it to zero. As a result we obtain

$$ u^*(N-1) = -\frac{F\left( \bar{a}(N-1)\,\bar{b}(N-1) + \Sigma_{ab}(N-1) \right)}{\left( \bar{b}^2(N-1) + \Sigma_{bb}(N-1) \right) F + R(N-1)}\; x(N-1) \tag{2.3.3} $$
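
The last-stage minimization can be checked numerically. The sketch below is an illustration only: it evaluates the quadratic in u from Eq. (2.3.2), computes the stationary point of Eq. (2.3.3), and confirms that perturbing the control raises the expected cost. The function names and the numerical values are assumptions made for this example.

```python
def one_step_cost(u, x, F, Q, R, a_bar, b_bar, s_aa, s_bb, s_ab, xi_var):
    """Expected last-stage cost of Eq. (2.3.2) as a function of u(N-1)."""
    return ((Q + F * (a_bar ** 2 + s_aa)) * x ** 2
            + (R + F * (b_bar ** 2 + s_bb)) * u ** 2
            + 2.0 * F * (a_bar * b_bar + s_ab) * x * u
            + F * xi_var)


def optimal_last_control(x, F, R, a_bar, b_bar, s_bb, s_ab):
    """Stationary point of the quadratic above, Eq. (2.3.3)."""
    return -F * (a_bar * b_bar + s_ab) / ((b_bar ** 2 + s_bb) * F + R) * x


if __name__ == "__main__":
    # Arbitrary illustrative values.
    params = dict(F=1.0, Q=1.0, R=1.0, a_bar=1.1, b_bar=0.8,
                  s_aa=0.1, s_bb=0.2, s_ab=0.05, xi_var=0.3)
    x = 2.0
    u_star = optimal_last_control(x, params["F"], params["R"],
                                  params["a_bar"], params["b_bar"],
                                  params["s_bb"], params["s_ab"])
    for du in (-0.5, -0.1, 0.0, 0.1, 0.5):
        print(f"u = u* {du:+.1f}: cost = "
              f"{one_step_cost(u_star + du, x, **params):.6f}")
```

Note that the variance of a(t) enters the cost but not the minimizing control at this stage, while the variance of b(t) appears in the denominator of Eq. (2.3.3).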


Substituting this optimal control at N-1 into the cost
Eq. (2.3.2), the optimum cost-to-go becomes

$$ V(x(N-1), N-1) = x^2(N-1)\,K(N-1) + F\,\Xi(N-1) \tag{2.3.4} $$

where

$$ K(N-1) = F\left( \Sigma_{aa}(N-1) + \bar{a}^2(N-1) \right) + Q(N-1) - G^2(N-1) \left[ R(N-1) + F\left( \bar{b}^2(N-1) + \Sigma_{bb}(N-1) \right) \right] \tag{2.3.5} $$

$$ G(N-1) = \frac{F\left[ \bar{a}(N-1)\,\bar{b}(N-1) + \Sigma_{ab}(N-1) \right]}{R(N-1) + F\left( \bar{b}^2(N-1) + \Sigma_{bb}(N-1) \right)} \tag{2.3.6} $$


We note that the optimum cost-to-go at time N-1 is
of the same form as Eq. (2.3.1). The second term is due to
the additive noise driving the system. The first term
includes the cost of control and implicitly the added cost
due to the randomness of the parameters a(N-1) and b(N-1).
At time N-2, the cost-to-go is given by the equation

$$
\begin{aligned}
V(x(N-2), N-2) &= \min_{u(N-2)} E\left\{ Q(N-2)\,x^2(N-2) + R(N-2)\,u^2(N-2) + V(x(N-1), N-1) \mid x(N-2) \right\} \\
&= \min_{u(N-2)} E\left\{ Q(N-2)\,x^2(N-2) + R(N-2)\,u^2(N-2) + K(N-1)\,x^2(N-1) \mid x(N-2) \right\} + F\,\Xi(N-1)
\end{aligned}
\tag{2.3.7}
$$

This expression for the cost-to-go is identical to
that in Eq. (2.3.2) except for the time indexes, with K(N-1)
taking the place of F. Therefore,
the optimal control u*(N-2) is given by

$$ u^*(N-2) = -\frac{K(N-1)\left( \bar{a}(N-2)\,\bar{b}(N-2) + \Sigma_{ab}(N-2) \right)}{\left( \bar{b}^2(N-2) + \Sigma_{bb}(N-2) \right) K(N-1) + R(N-2)}\; x(N-2) \tag{2.3.8} $$

and the optimal cost-to-go is given by

$$ V^*(x(N-2), N-2) = K(N-2)\,x^2(N-2) + K(N-1)\,\Xi(N-2) + F\,\Xi(N-1) \tag{2.3.9} $$

where

$$ K(N-2) = K(N-1)\left( \bar{a}^2(N-2) + \Sigma_{aa}(N-2) \right) + Q(N-2) - \frac{K^2(N-1)\left( \bar{a}(N-2)\,\bar{b}(N-2) + \Sigma_{ab}(N-2) \right)^2}{R(N-2) + K(N-1)\left( \bar{b}^2(N-2) + \Sigma_{bb}(N-2) \right)} \tag{2.3.10} $$

By induction on t, we obtain the solution to the
stochastic state regulator problem. Given the linear stochastic
system Eq. (2.2.1) and the cost functional Eq. (2.2.8),
where u(t) is not constrained, the optimal feedback control
at each instant of time is given by a linear transformation
of the state,

$$ u^*(t) = -G(t)\,x(t) \tag{2.3.11} $$

where

$$ G(t) = \frac{K(t+1)\left( \Sigma_{ab}(t) + \bar{a}(t)\,\bar{b}(t) \right)}{R(t) + \left( \Sigma_{bb}(t) + \bar{b}^2(t) \right) K(t+1)} \tag{2.3.12} $$

and K(t) is the solution of the Riccati-like equation

$$ K(t) = \left( \bar{a}^2(t) + \Sigma_{aa}(t) \right) K(t+1) + Q(t) - G^2(t) \left[ R(t) + K(t+1)\left( \Sigma_{bb}(t) + \bar{b}^2(t) \right) \right] \tag{2.3.13} $$

satisfying the boundary condition

$$ K(N) = F \tag{2.3.14} $$
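
The recursion in Eqs. (2.3.12) to (2.3.14) is straightforward to evaluate numerically. The following is a minimal sketch, assuming a stationary system (constant means, covariances, and weights) purely to keep the argument list short; the function name `riccati_gains` is hypothetical. Time-varying data would simply be indexed by t inside the loop.

```python
def riccati_gains(N, a_bar, b_bar, s_aa, s_bb, s_ab, Q, R, F):
    """Backward sweep of Eqs. (2.3.12)-(2.3.14) for constant scalar data.

    Returns (K, G) where K[t] is defined for t = 0..N and G[t] for t = 0..N-1.
    """
    K = [0.0] * (N + 1)
    G = [0.0] * N
    K[N] = F                                   # boundary condition (2.3.14)
    for t in range(N - 1, -1, -1):
        denom = R + (s_bb + b_bar ** 2) * K[t + 1]
        G[t] = K[t + 1] * (s_ab + a_bar * b_bar) / denom       # (2.3.12)
        K[t] = ((a_bar ** 2 + s_aa) * K[t + 1] + Q
                - G[t] ** 2 * denom)                           # (2.3.13)
    return K, G


if __name__ == "__main__":
    # Illustrative values; setting all variances to zero recovers the
    # usual deterministic-parameter linear-quadratic recursion.
    K, G = riccati_gains(N=10, a_bar=1.1, b_bar=0.8,
                         s_aa=0.1, s_bb=0.2, s_ab=0.05,
                         Q=1.0, R=1.0, F=1.0)
    for t, (k, g) in enumerate(zip(K, G)):
        print(f"t = {t:2d}  K = {k:.6f}  G = {g:.6f}")
```

The control at time t is then u(t) = -G[t] x(t), as in Eq. (2.3.11).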


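The state of the optimal system is then the solution
of the linear difference equation

$$ x(t+1) = \left[ a(t) - b(t)\,\frac{K(t+1)\left( \Sigma_{ab}(t) + \bar{a}(t)\,\bar{b}(t) \right)}{R(t) + K(t+1)\left( \Sigma_{bb}(t) + \bar{b}^2(t) \right)} \right] x(t) + \xi(t), \qquad x(0) = x_0 \tag{2.3.15} $$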


The optimal control given by Eq. (2.3.11) is a
random variable since x(t) is a random variable. It is
linear in the completely measurable state. The uncertainty
in the parameters a(t) and b(t) introduces equivalent state
and control weightings, $\Sigma_{aa}(t)K(t+1)$ and $\Sigma_{bb}(t)K(t+1)$,
respectively, in a very natural way into the control problem.

In order for the extremal control to be the unique
optimal control, we need to show that the second partial
derivative of J with respect to u is positive, that is,

$$ R(t) + \left( \Sigma_{bb}(t) + \bar{b}^2(t) \right) K(t+1) > 0 \tag{2.3.16} $$

The solution to the Riccati-like Eq. (2.3.13) is non-negative
definite. This can be seen from the fact that for any x,

$$ x^2 K(t) = \min_u E\left[ x^2 Q(t) + u^2 R(t) + \left( a(t)\,x + b(t)\,u \right)^2 K(t+1) \right], \qquad K(N) = F \ge 0 \tag{2.3.17} $$

Since $F, Q(t) \ge 0$ and $R(t) > 0$, the expression within the bracket
is non-negative. Since the minimization over u preserves
non-negativity, it follows that $x^2 K(t) \ge 0$ for all x. Hence, K(t)
is non-negative definite. Since R(t) is positive definite,
we conclude that $R(t) + (\Sigma_{bb}(t) + \bar{b}^2(t))K(t+1) > 0$.

The Riccati-like Eq. (2.3.13) is a first-order
nonlinear time-varying ordinary difference equation; its solution
K(t) exists and is unique. The extremal control given by
Eq. (2.3.11) is, therefore, the unique optimal control.

The optimal cost-to-go is obtained by substituting
the expression for the optimal control, Eqs. (2.3.11) and
(2.3.12), into Eq. (2.2.8) to get

$$J^*(x(t),t) = K(t)\,x^2(t) + \sum_{\tau=t}^{N-1} K(\tau+1)\,\Xi(\tau) \tag{2.3.18}$$

If the optimal control $u(t) \ne 0$ for all states, then $K(t) > 0$
for all $0 < t < N$. This follows from the fact that if $u(t) \ne 0$, then
the cost T must be positive. We shall say that an optimal
control exists when $J^*$ is defined for all x(t) and t.

Figure 2.1 shows the structure of the optimal feedback
system. Since the optimal control is u(t) = -G(t)x(t),
the state x(t) is multiplied by the linear gain G(t) to generate
the control. The optimal feedback system is, thus,
linear and time-varying in the finite horizon problem. This
will be the case even if the system is stationary and the cost
functional is time-invariant. Note that the optimal control
given by Eqs. (2.3.11) to (2.3.13) is modulated by the
covariances of the purely random (white) parameters. The optimal
controller is cautious when the parameter b(t) is uncertain:
the gain G(t) is smaller in magnitude, ceteris paribus, than
the linear-quadratic gain. The controller is more vigorous
when the parameter a(t) is uncertain, since the controller
must be more active to regulate the system: the gain G(t)
is larger in magnitude, ceteris paribus, with larger
variance $\Sigma_{aa}(t)$.
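The cautious and vigorous behaviors described above can be checked with a small numerical experiment. The sketch below assumes constant statistics and uses illustrative numbers that are not from the thesis; the helper name `gains` is hypothetical. The caution effect is compared at the last stage, where all cases share the same K(N) = F, so the comparison is ceteris paribus.

```python
def gains(a_bar, b_bar, s_aa, s_bb, s_ab, Q=1.0, R=1.0, F=1.0, N=10):
    """Gain sequence G[0..N-1] from Eqs. (2.3.12)-(2.3.14), constant statistics."""
    K_next, G = F, []
    for _ in range(N):                      # runs from t = N-1 down to t = 0
        denom = R + (s_bb + b_bar ** 2) * K_next
        g = K_next * (s_ab + a_bar * b_bar) / denom
        K_next = (a_bar ** 2 + s_aa) * K_next + Q - g ** 2 * denom
        G.append(g)
    G.reverse()                             # reorder so that G[t] is the gain at time t
    return G


ce = gains(1.1, 1.0, 0.0, 0.0, 0.0)           # known parameters
uncertain_b = gains(1.1, 1.0, 0.0, 0.5, 0.0)  # only b(t) is random
uncertain_a = gains(1.1, 1.0, 0.5, 0.0, 0.0)  # only a(t) is random

print("last-stage gain, b uncertain vs known:", uncertain_b[-1], ce[-1])
print("initial gain, a uncertain vs known:   ", uncertain_a[0], ce[0])
```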

Since the gain G(t) is a function of K(t), the
solution K(t) to the Riccati-like Eq. (2.3.13) governs the
behavior of the optimal feedback system. Eq. (2.3.13)
is nonlinear and, in general, we cannot obtain closed-form
solutions. We shall discuss in the next section the solution
K(t) to Eq. (2.3.13) as $N \to \infty$ to obtain a steady-state
controller for the stationary system and cost functional with
constant weightings.

Figure 2.1  Optimal controller for system equation (2.

We  remark  that  the  optimal  control  law  given  by 
Eqs.  (2.3.11)  to  (2.3.13)  is  not  the  Certainty-Equivalent 
control,  since  the  control  gain  depends  on  the  parameter 
variances.  The  Certainty-Equivalent  control  law  is 


$$u^{C.E.}(t) = -\,\frac{\bar b(t)\,K(t+1)\,\bar a(t)}{\bar b^2(t)\,K(t+1) + R(t)}\;x(t) \tag{2.3.19}$$

where

$$K(t) = \bar a^2(t)\,K(t+1) + Q(t) - \frac{\bar b^2(t)\,K^2(t+1)\,\bar a^2(t)}{\bar b^2(t)\,K(t+1) + R(t)} \tag{2.3.20}$$

This can be obtained from Eqs. (2.3.11) to (2.3.13) by setting
arbitrarily $\Sigma_{aa}(t) = \Sigma_{bb}(t) = \Sigma_{ab}(t) = 0$. The
Certainty-Equivalence control law does not account for the uncertainty
in the system parameters.

The  optimal  stochastic  control  is  without  posterior 
learning.  The  parameters  a(t)  and  b(t)  cannot  be  identified, 
because  by  assumption  they  are  white.  Nature/chance  picks 
the  parameters  and  the  controller  must  adapt  to  the  structural 
change. This is a worst-case control system design, as compared
to assuming the parameters are unknown but constant or
slowly time-varying. However, the assumption of purely
random  parameters  is  unrealistic  from  a physical  point  of 
view.  The  assumption  that  the  parameters  are  unknown  but 
constant  leads  to  the  well-known  dual  control  problem  whose 
exact  solution  cannot  be  easily  computed  analytically.  The 
white  parameter  assumption  leads  to  a very  simple  stochastic 
control  law  Eq.  (2.3.11)  that  can  be  easily  implemented. 
Economists,  and  in  particular  Chow  [38]  have  argued  that  in 
economic  systems,  treatment  of  unknown  parameters  as  being 
purely  random  is  desirable  to  obtain  the  inherent  caution 
in  the  control  especially  when  b(t)  is  not  known  accurately. 
In [32], Athans and Varaiya have argued that the control of
systems with white parameters represents a worst-case situation
in which the ratio

$$\frac{K(0 \mid \Sigma_{aa} \ne 0,\ \Sigma_{bb} \ne 0,\ \Sigma_{ab} \ne 0)}{K(0 \mid \Sigma_{aa} = 0,\ \Sigma_{bb} = 0,\ \Sigma_{ab} = 0)} > 1 \tag{2.3.21}$$


provides  a measure  of  the  deterioration  in  performance  due 
to  the  unknown  parameters,  which  can  provide  a guide  as  to 
whether  sophisticated  parameter  estimation  and  adaptive 
control  algorithms  are  warranted. 


2.4  Asymptotic  Behavior 

We assume in this section that the stochastic linear
system given by Eq. (2.2.1) has wide-sense stationary statistics.
The state and control weightings Q(t) and R(t) are assumed
to be constant.

The Riccati-like Eq. (2.3.13) is then given by

$$K(t) = Q + K(t+1)\bigl(\bar a^2 + \Sigma_{aa}\bigr) - \frac{K^2(t+1)\bigl(\bar a\bar b + \Sigma_{ab}\bigr)^2}{\bigl(\bar b^2 + \Sigma_{bb}\bigr)K(t+1) + R}, \qquad K(N) = 0 \tag{2.4.1}$$

Since the nonlinear difference Eq. (2.4.1) has constant
parameters, one may well think that it will attain a
steady-state solution "backward in time" as it certainly does
for the ordinary linear-quadratic problem with known parameters,
so that one can then calculate the infinite horizon
(constant) gain. This is, however, not the case for Eq.
(2.4.1).

Figures 2.2, 2.3, and 2.4 show the numerical solution
of Eq. (2.4.1) for N = 50 for different values of means
and covariances of the parameters. Note the logarithmic
scale used. A close examination of Eq. (2.4.1) shows what
can happen to the solution K(t) of the Riccati equation.

Consider then Eq. (2.4.1) and assume that K(t+1)
is "large". Then the "backward in time" evolution of K(t)
is given approximately by

$$K(t) \approx m\,K(t+1) \tag{2.4.2}$$


where the threshold parameter m is given by

$$m = \Sigma_{aa} + \bar a^2 - \frac{(\Sigma_{ab} + \bar a\bar b)^2}{\Sigma_{bb} + \bar b^2} \tag{2.4.3}$$

or

$$m = \frac{\Sigma_{aa}\Sigma_{bb} + \Sigma_{aa}\bar b^2 + \Sigma_{bb}\bar a^2 - \Sigma_{ab}^2 - 2\Sigma_{ab}\,\bar a\bar b}{\Sigma_{bb} + \bar b^2} \tag{2.4.4}$$


Clearly, from Eq. (2.4.2), K(t) will undergo exponential
growth "backward in time" if

$$m > 1 \tag{2.4.5}$$

From the expression in Eq. (2.4.3) or (2.4.4) one can see
that there are certain combinations of the parameter means
and covariances that will yield the inequality condition in
Eq. (2.4.5). Hence, we can immediately arrive at the conclusion
that in the case of optimal stochastic control with
purely random (white) parameters, a well-behaved solution to
the infinite horizon problem may not exist.
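As a quick check that the two forms of the threshold parameter agree, the following sketch evaluates Eq. (2.4.3) and Eq. (2.4.4) side by side. The function and argument names are hypothetical, the numbers are illustrative only, and the code assumes that the second moment of b(t) is strictly positive so that the division is defined.

```python
def threshold_m(a_bar, b_bar, s_aa, s_bb, s_ab):
    """Threshold parameter in the form of Eq. (2.4.3)."""
    return (s_aa + a_bar ** 2
            - (s_ab + a_bar * b_bar) ** 2 / (s_bb + b_bar ** 2))


def threshold_m_expanded(a_bar, b_bar, s_aa, s_bb, s_ab):
    """The same parameter in the expanded form of Eq. (2.4.4)."""
    num = (s_aa * s_bb + s_aa * b_bar ** 2 + s_bb * a_bar ** 2
           - s_ab ** 2 - 2.0 * s_ab * a_bar * b_bar)
    return num / (s_bb + b_bar ** 2)


# Illustrative statistics (assumed, not from the thesis).
params = dict(a_bar=1.1, b_bar=1.0, s_aa=1.0, s_bb=0.49, s_ab=0.2)
m = threshold_m(**params)
m_expanded = threshold_m_expanded(**params)
print(m, m_expanded, "exceeds threshold" if m > 1.0 else "below threshold")
```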

A different insight can be provided by examining
the dependence of the optimal cost upon the planning horizon.
Figure 2.5 shows the behavior of the optimal cost versus the
horizon N. Note that if the threshold parameter m > 1 then the
optimal cost grows exponentially,

$$J^*(N) = x^2(0)\,e^{mN}, \qquad m > 1 \tag{2.4.6}$$

Otherwise (m < 1) the optimal cost remains bounded and finite.
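The dependence on the planning horizon can be reproduced by iterating Eq. (2.4.1) backward for increasing N. The sketch below is an illustration with assumed numbers, not a reproduction of Figure 2.5; the helper name `K0` is hypothetical. For m > 1 the ratio of successive values of K(0) approaches m, in line with Eq. (2.4.2).

```python
def K0(N, a_bar, b_bar, s_aa, s_bb, s_ab, Q=1.0, R=1.0):
    """K(0) after N backward steps of Eq. (2.4.1), starting from K(N) = 0."""
    A = s_aa + a_bar ** 2
    B = s_bb + b_bar ** 2
    C = s_ab + a_bar * b_bar
    K = 0.0
    for _ in range(N):
        K = Q + K * A - (K * C) ** 2 / (B * K + R)
    return K


low = dict(a_bar=1.1, b_bar=1.0, s_aa=0.1, s_bb=0.1, s_ab=0.0)    # m = 0.21
high = dict(a_bar=1.1, b_bar=1.0, s_aa=1.0, s_bb=0.49, s_ab=0.0)  # m is about 1.40

for N in (10, 50, 100):
    print(N, K0(N, **low), K0(N, **high))

growth = K0(101, **high) / K0(100, **high)   # approaches m when m > 1
```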
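Now, suppose that Eq. (2.4.1) has a steady-state
solution given by $\hat K$ satisfying the algebraic equation

$$\hat K = \hat K\bigl(\bar a^2 + \Sigma_{aa}\bigr) + Q - \frac{\hat K^2(\Sigma_{ab} + \bar a\bar b)^2}{R + \hat K(\Sigma_{bb} + \bar b^2)} \tag{2.4.7}$$

Note that $\hat K$ must be positive definite. The solution to the
quadratic equation is then given by

$$\hat K = \frac{-\bigl(R(\Sigma_{aa} + \bar a^2 - 1) + Q(\Sigma_{bb} + \bar b^2)\bigr) - \Bigl[\bigl(R(\Sigma_{aa} + \bar a^2 - 1) - Q(\Sigma_{bb} + \bar b^2)\bigr)^2 + 4QR(\Sigma_{ab} + \bar a\bar b)^2\Bigr]^{1/2}}{2\bigl((\Sigma_{aa} + \bar a^2 - 1)(\Sigma_{bb} + \bar b^2) - (\Sigma_{ab} + \bar a\bar b)^2\bigr)} \tag{2.4.8}$$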


The limiting solution $\hat K$ is positive if

$$\Sigma_{aa} + \bar a^2 - \frac{(\Sigma_{ab} + \bar a\bar b)^2}{\Sigma_{bb} + \bar b^2} < 1 \tag{2.4.9}$$

that is, if

$$m < 1 \tag{2.4.10}$$
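A numerical sanity check of the closed-form steady-state solution can be sketched as follows: the root with the negative square root in Eq. (2.4.8) is evaluated for an assumed parameter set with m < 1 and substituted back into the algebraic Eq. (2.4.7). Names and numbers are illustrative assumptions, not values from the thesis.

```python
import math


def steady_state_K(a_bar, b_bar, s_aa, s_bb, s_ab, Q, R):
    """Positive root of Eq. (2.4.7), written as in Eq. (2.4.8); assumes m < 1."""
    A1 = s_aa + a_bar ** 2 - 1.0
    B = s_bb + b_bar ** 2
    C = s_ab + a_bar * b_bar
    disc = (R * A1 - Q * B) ** 2 + 4.0 * Q * R * C ** 2
    return (-(R * A1 + Q * B) - math.sqrt(disc)) / (2.0 * (A1 * B - C ** 2))


def residual(K, a_bar, b_bar, s_aa, s_bb, s_ab, Q, R):
    """Right side minus left side of the algebraic Eq. (2.4.7)."""
    A = s_aa + a_bar ** 2
    B = s_bb + b_bar ** 2
    C = s_ab + a_bar * b_bar
    return K * A + Q - (K * C) ** 2 / (R + K * B) - K


p = dict(a_bar=1.1, b_bar=1.0, s_aa=0.1, s_bb=0.1, s_ab=0.0, Q=1.0, R=1.0)
K_hat = steady_state_K(**p)          # here m = 0.21 < 1
print(K_hat, residual(K_hat, **p))
```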


We  state  the  following  result. 

Theorem  2.1 

The unique positive solution to the infinite horizon
problem given by Eqs. (2.2.1)-(2.2.7) exists if and only if
m < 1.

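Proof: (⇒) We rewrite the Riccati-like Eq. (2.3.13),
reversing the time index, as

$$K(t+1) = Q + K(t)\left(\Sigma_{aa} + \bar a^2 - \frac{(\Sigma_{ab} + \bar a\bar b)^2}{\Sigma_{bb} + \bar b^2}\right) + \frac{(\Sigma_{ab} + \bar a\bar b)^2}{\Sigma_{bb} + \bar b^2}\left[K(t) - \frac{K^2(t)}{\dfrac{R}{\Sigma_{bb} + \bar b^2} + K(t)}\right] \tag{2.4.11}$$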


Since the third term is non-negative definite (R > 0),

$$K(t+1) \ge Q + K(t)\,m \ge Q \tag{2.4.12}$$

It follows immediately that if m > 1, then K(t) diverges as
$t \to \infty$.


Since the third term is monotone increasing in K(t),
it follows that K(t) is monotone increasing for K(0) = Q. Let

$$M(t) = K(t) - \frac{K^2(t)}{\dfrac{R}{\Sigma_{bb} + \bar b^2} + K(t)} \tag{2.4.13}$$


Note that M(t) is also monotone for positive R. Thus there
exists an $\alpha > 0$ such that

$$M^{-1}(t) = \frac{1}{K(t)} + \frac{\Sigma_{bb} + \bar b^2}{R} \ge \alpha^{-1} \tag{2.4.14}$$

from which we have that M(t) is uniformly bounded in K(t),
that is,

$$M(t) \le \alpha, \qquad \alpha > 0 \tag{2.4.15}$$


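It follows from Eqs. (2.4.11) and (2.4.15) that

$$K(t+1) \le Q + \left(\Sigma_{aa} + \bar a^2 - \frac{(\Sigma_{ab} + \bar a\bar b)^2}{\Sigma_{bb} + \bar b^2}\right)K(t) + \alpha\,\frac{(\Sigma_{ab} + \bar a\bar b)^2}{\Sigma_{bb} + \bar b^2} \le \sum_{i=0}^{t+1} m^i\left(Q + \alpha\,\frac{(\Sigma_{ab} + \bar a\bar b)^2}{\Sigma_{bb} + \bar b^2}\right) \tag{2.4.16}$$

so that K(t) is bounded as $t \to \infty$ because m < 1.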

Since there is a sharp dividing line, quantified by
the means and covariances of the random parameters, between
the cases in which the optimal stochastic control exists or does
not exist for the infinite horizon case (see Fig. 2.6), there
is a fundamental limitation to the optimal
infinite-time quadratic control problem. We call this
phenomenon the Uncertainty Threshold Principle. This result
has several implications in engineering and socioeconomic
systems, since it points out that there is a clear quantifiable
boundary between our ability to make optimal decisions or
not (in the sense that the optimal cost is bounded) as a
function of the parameter modeling uncertainty.

Katayama [51] has pointed out this instability
problem when b(t) is random in a multivariable system. For
continuous-time systems the existence of solutions has been
investigated by Bismut [45], but only for finite horizon
problems. In related problems involving control-dependent
noise, Kleinman [41] assumed the existence of a solution.

In the case of known parameters ($\Sigma_{aa} = \Sigma_{bb} = \Sigma_{ab} = 0$),
Eq. (2.4.4) yields m = 0. This is the reason why there is no
problem with the stationary solution for the standard linear
quadratic problem.


In the case where $a(t) = \bar a$ ($\Sigma_{aa} = 0 = \Sigma_{ab}$), Eq. (2.4.4)
yields

$$m = \frac{\Sigma_{bb}\,\bar a^2}{\Sigma_{bb} + \bar b^2} \tag{2.4.17}$$

so that as long as $\bar a^2$ is less than or approximately equal to
one, then m < 1 and there is no convergence problem for the
solution K(t) to the Riccati-like Eq. (2.4.1) (see Fig. 2.7).
This may possibly explain Kleinman's results [41] on
control-dependent noise problems and their application for pilot
models controlling stable aircraft. This is also the same
stability condition derived by Katayama for random gains [51].

In the case where $b(t) = \bar b$ ($\Sigma_{bb} = 0 = \Sigma_{ab}$), Eq. (2.4.4)
yields $m = \Sigma_{aa}$. This implies that independent of the average
values of a and b, as long as the variance $\Sigma_{aa}$ of the "time
constant" a of the system exceeds unity, then one is in
trouble for long horizon planning problems, even for systems
that are stable on the average ($|\bar a| < 1$). This result seems
to state that when the standard deviation of the parameter
a(t) is greater than unity, then the system is statistically
mean-square unstable, and under these conditions, one cannot
stabilize the system. This provides a tie with the literature
on stochastic stability with state-dependent noises ([52], [53]).
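The two special cases above follow directly from Eq. (2.4.3), and a short numerical check makes this concrete. The sketch uses hypothetical names and illustrative numbers; it assumes a nonzero mean of b(t) in the second case so that the division is defined.

```python
def threshold_m(a_bar, b_bar, s_aa, s_bb, s_ab):
    """Threshold parameter of Eq. (2.4.3)."""
    return (s_aa + a_bar ** 2
            - (s_ab + a_bar * b_bar) ** 2 / (s_bb + b_bar ** 2))


a_bar, b_bar = 0.9, 2.0          # stable on the average, |a_bar| < 1

# Case a(t) known (Sigma_aa = Sigma_ab = 0): compare with Eq. (2.4.17).
s_bb = 0.7
m_known_a = threshold_m(a_bar, b_bar, 0.0, s_bb, 0.0)
m_2_4_17 = s_bb * a_bar ** 2 / (s_bb + b_bar ** 2)

# Case b(t) known (Sigma_bb = Sigma_ab = 0): m reduces to Sigma_aa.
s_aa = 1.5
m_known_b = threshold_m(a_bar, b_bar, s_aa, 0.0, 0.0)

print(m_known_a, m_2_4_17)   # below one because a_bar**2 < 1
print(m_known_b)             # equals Sigma_aa = 1.5 > 1 although |a_bar| < 1
```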

From Eq. (2.4.3), it is evident that a non-zero
parameter correlation ($\Sigma_{ab} > 0$) always reduces the value of
m, and hence it helps prevent (up to a point) the divergence
of K(t). From a modeling viewpoint, this implies that a
careful modeling of the relationship of the joint statistics
in the coefficients that multiply the state variables and
those that multiply the control variables can only help.

Suppose that the threshold parameter m < 1 so that a
steady-state $\hat K$ exists; then the steady-state control gain
given by

$$\hat G = \lim_{N\to\infty} G(t) = \frac{\hat K\,(\Sigma_{ab} + \bar a\bar b)}{R + \hat K\,(\Sigma_{bb} + \bar b^2)} \tag{2.4.18}$$

is well-defined. Since the gain $\hat G$ is constant, the resulting
optimal system will be linear and constant; from an
engineering point of view, such an optimal controller would
be very simple to construct for stationary systems.

Next, suppose that $\bar b = 0$, so that the system (2.2.1)
is "most uncontrollable on the average". Note that $\hat G \ne 0$ and
$u(t) \ne 0$ provided that the correlation $\Sigma_{ab} \ne 0$. This means
heuristically that the random time constant system is
controllable in a stochastic sense; the nonzero covariance
$\Sigma_{ab}$ means that a(t) and b(t) "swing together" and this implies
that we can still control a system which is "most uncontrollable
on the average". This observation seems to suggest a
new concept of "stochastic controllability".

Note that in the case $\bar b = 0$, the uncertainty threshold
parameter m is given by

$$m = \frac{\Sigma_{aa}\Sigma_{bb} - \Sigma_{ab}^2}{\Sigma_{bb}} + \bar a^2 \tag{2.4.19}$$

In view of the fact (2.2.7), this "stochastic controllability"
is possible only for systems that are stable on the average
($|\bar a| < 1$); otherwise m > 1 (see Fig. 2.8).

Suppose now that the threshold parameter m > 1, so
that the optimal cost given by Eq. (2.3.18) grows exponentially
with the time horizon N. The control gain remains, however,
a well-defined quantity, and is given by the constant value

$$\hat G = \frac{\Sigma_{ab} + \bar a\bar b}{\Sigma_{bb} + \bar b^2} \tag{2.4.20}$$

which is obtained by letting $K(t+1) \to \infty$ in Eq. (2.3.12). One
could argue that there is an optimal limiting gain in the
sense that one is still trying to do one's best so as to minimize
the rate of the exponential growth of the optimal cost
$J^*$ with increasing horizon N (see Fig. 2.5).

To see further the implication of this philosophy
one can substitute the gain $\hat G$ in the system dynamics Eq.
(2.2.1) and obtain the stochastic control system

$$x(t+1) = \bigl(a(t) - b(t)\,\hat G\bigr)\,x(t) \tag{2.4.21}$$

Under the assumption that x(t) can be measured exactly, the
mean $\bar x(t) = E\{x(t)\}$ will propagate (in an open-loop sense) as

$$\bar x(t+1) = (\bar a - \bar b\,\hat G)\,\bar x(t), \qquad \bar x(0) = x(0) \tag{2.4.22}$$

The state error covariance

$$\Sigma_{xx}(t) = E\{(x(t) - \bar x(t))^2\} \tag{2.4.23}$$

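can then be shown to propagate according to

$$\Sigma_{xx}(t+1) = m\,\Sigma_{xx}(t) + \left[\Sigma_{aa} + \frac{-2\Sigma_{ab}(\Sigma_{ab} + \bar a\bar b)(\Sigma_{bb} + \bar b^2) + \Sigma_{bb}(\Sigma_{ab} + \bar a\bar b)^2}{(\Sigma_{bb} + \bar b^2)^2}\right]\bar x^2(t), \qquad \Sigma_{xx}(0) = 0 \tag{2.4.24}$$

where m is the threshold parameter given by Eq. (2.4.3).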
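The propagation of the mean and variance under the limiting gain can be checked numerically. The sketch below iterates the mean recursion (2.4.22) and a variance recursion of the form (2.4.24), in which the coefficient multiplying the squared mean is taken to be the variance of a(t) - b(t)G for the gain of Eq. (2.4.20); that reading of the coefficient is an assumption. The check is that the second moment, variance plus squared mean, grows exactly as m^t. Names and numbers are illustrative.

```python
def propagate_moments(a_bar, b_bar, s_aa, s_bb, s_ab, x0, steps):
    """Iterate the mean (2.4.22) and variance (2.4.24) under the gain (2.4.20)."""
    B = s_bb + b_bar ** 2
    C = s_ab + a_bar * b_bar
    G_hat = C / B                            # limiting gain (2.4.20)
    m = s_aa + a_bar ** 2 - C ** 2 / B       # threshold parameter (2.4.3)
    # Coefficient of the squared mean: variance of a(t) - b(t) * G_hat.
    coeff = s_aa + (-2.0 * s_ab * C * B + s_bb * C ** 2) / B ** 2
    mean, var = x0, 0.0
    history = [(mean, var)]
    for _ in range(steps):
        var = m * var + coeff * mean ** 2        # uses the mean at time t
        mean = (a_bar - b_bar * G_hat) * mean
        history.append((mean, var))
    return m, history


m, history = propagate_moments(1.1, 1.0, 1.0, 0.49, 0.2, x0=1.0, steps=30)
print(m, history[-1])   # m > 1 here, so the variance keeps growing
```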

It is clear that if m > 1 in Eq. (2.4.24) then the
open-loop propagation of the variance of the state $\Sigma_{xx}(t)$ is
unstable. Essentially, this says that although the steady-state
control is well-defined by a constant gain Eq. (2.4.20),
and the closed-loop system of Eq. (2.4.21) can be implemented,
the variability of the state as measured by its variance
"blows up" as t becomes large.

A sufficient condition that will ensure that the
inequality (2.4.10) will be met is

$$\Sigma_{aa} + \bar a^2 < 1 \tag{2.4.25}$$

This condition is both a necessary and sufficient condition
for the asymptotic variance of the uncontrolled linear system

$$x(t+1) = a(t)\,x(t) \tag{2.4.26}$$

to be finite, and thus turns out to be sufficient to ensure
that an optimal control exists as well.

2.5  Stochastic  Stability  Results

We now analyze the optimal control problem
posed in Section 2.2 from an alternative point of view and
arrive at exactly the same conclusions. The approach treats
the stochastic control problem as essentially a mathematical
problem, that is, a stochastic difference equation, and we
consider the stochastic stability of such a system under
feedback. Asymptotic stability of linear stochastic systems
with random coefficients has been considered in [52] to [57].
Consider the first-order linear dynamical system

$$x(t+1) = a(t)\,x(t) + b(t)\,u(t) \tag{2.5.1}$$

One can include additive white noise driving the system
dynamics, but the stability result is unchanged from the
deterministic case. The question we want to deal with is
whether or not the system Eq. (2.5.1) is stabilizable under
feedback when a(t) and b(t) are assumed to be random
coefficients.


Let

$$u(t) = g(t)\,x(t) \tag{2.5.2}$$

Thus the closed-loop system will propagate according to the
stochastic equation

$$x(t+1) = \bigl[a(t) + g(t)\,b(t)\bigr]x(t) = c(t)\,x(t) \tag{2.5.3}$$

If a(t) and b(t) are uncorrelated in time, one can calculate
the ratio

$$\frac{E\{x^2(t+1)\}}{E\{x^2(1)\}} = E\{c^2(1)\}\,E\{c^2(2)\}\cdots E\{c^2(t)\} = S(t) \tag{2.5.4}$$

The value of S(t) is a measure of how the second moment of
the state propagates in time. The larger the value of S(t),
the more variable the state is. In particular if

$$\lim_{t\to\infty} S(t) = \infty \tag{2.5.5}$$

the system (2.5.3) is unstable in the mean square sense.

The value of S(t) will be influenced in part by the
value of the feedback gain g(t) in Eq. (2.5.2). So one can
seek the value of g(t) which will minimize the ratio S(t)
in Eq. (2.5.4).

The product S(t) is minimized if each element of
the product

$$E\{c^2(t)\} = E\{[a(t) + g(t)\,b(t)]^2\} \tag{2.5.6}$$

is minimized by g(t).


Since

$$E\{c^2(t)\} = E\{a^2(t)\} + g^2(t)\,E\{b^2(t)\} + 2g(t)\,E\{a(t)b(t)\} \tag{2.5.7}$$

the best value of g(t) is obtained by algebraic
minimization, which yields

$$g^* = g^*(t) = -\,\frac{\Sigma_{ab} + \bar a\bar b}{\Sigma_{bb} + \bar b^2} = \text{constant} \tag{2.5.8}$$

Hence the minimum value of $E\{c^2(t)\}$ is given by

$$E\{c^{2*}(t)\} = E\{[a(t) + g^*\,b(t)]^2\} = \Sigma_{aa} + \bar a^2 - \frac{(\Sigma_{ab} + \bar a\bar b)^2}{\Sigma_{bb} + \bar b^2} = m \tag{2.5.9}$$

where m is the undiscounted threshold parameter given by
Eq. (2.4.3).

It follows that

$$S^*(t) = m^t \tag{2.5.10}$$

and hence that

$$\lim_{t\to\infty} S^*(t) < \infty \quad \text{if } m < 1. \tag{2.5.11}$$

We state the results in the following theorem.

Theorem 2.2

The stochastic system in Eq. (2.5.1) is stabilizable
by linear feedback in a mean-square sense if and only if the
uncertainty threshold parameter m, defined by Eq. (2.4.3), is
less than unity.

We note that the minimum variance gain $g^*$ in
Eq. (2.5.9) is the same as $\hat G$ in Eq. (2.4.20), up to the sign
convention u(t) = g(t)x(t) used here versus u(t) = -G(t)x(t)
used earlier, where we concluded that the limiting control gain
is a constant and the feedback system can be implemented. The
feedback system may or may not be stable in the mean-square
sense, depending on whether or not the condition m < 1 is
satisfied.
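The algebraic minimization leading to Eqs. (2.5.8) and (2.5.9) can be verified numerically. The sketch below evaluates the quadratic of Eq. (2.5.7) at the gain of Eq. (2.5.8) and at nearby gains; the function names and the numbers are illustrative assumptions.

```python
def second_moment(g, a_bar, b_bar, s_aa, s_bb, s_ab):
    """E{c^2} for c = a + g*b, expanded as in Eq. (2.5.7)."""
    return ((s_aa + a_bar ** 2)
            + g ** 2 * (s_bb + b_bar ** 2)
            + 2.0 * g * (s_ab + a_bar * b_bar))


def best_gain(a_bar, b_bar, s_aa, s_bb, s_ab):
    """Minimizing gain of Eq. (2.5.8)."""
    return -(s_ab + a_bar * b_bar) / (s_bb + b_bar ** 2)


stats = dict(a_bar=1.1, b_bar=1.0, s_aa=0.1, s_bb=0.1, s_ab=0.0)
g_star = best_gain(**stats)
m = second_moment(g_star, **stats)       # equals the threshold parameter
print(g_star, m)
print(second_moment(g_star + 0.1, **stats), second_moment(g_star - 0.1, **stats))
```

Because Eq. (2.5.7) is a convex quadratic in g, any other gain gives a larger second moment, which is why no linear feedback can achieve mean-square stability once m reaches one.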

The stochastic stability analysis resulted in an
optimal gain g(t) given by Eq. (2.5.8) which is identical to
Eq. (2.4.20). It yields the sufficient condition for optimal
control to exist. Since we are considering mean-square
stability, we could have obtained the same gain by setting
R = 0 in the cost functional Eq. (2.2.8); and then Eq. (2.4.18)
becomes Eq. (2.5.8). The stochastic stability condition is
thus independent of the numerical solution K.

Following Kozin [58], we consider now the "almost
sure stability" analysis (sample path stability) of the
stochastic linear system Eq. (2.5.1) under feedback Eq.
(2.5.2).

Definition 2.5.1. The equilibrium solution x(t) = 0 of the
system

$$x(t+1) = \bigl(a(t) + b(t)\,g(t)\bigr)x(t) = c(t)\,x(t) \tag{2.5.12}$$

where $x(0) = x_0$ is a random variable, is almost surely stable if

$$\lim_{\delta\to0} P\Bigl\{\sup_{|x_0|<\delta}\ \sup_{t>0}\ |x(t,\omega)| > \varepsilon\Bigr\} = 0 \tag{2.5.13}$$

for any given $\varepsilon > 0$ and $\delta(\varepsilon, 0) > 0$.

For discrete-time systems, an equivalent condition
is given in [59].

Definition 2.5.2. The equilibrium solution x(t) = 0 of the
system Eq. (2.5.12) is almost surely stable if for $\varepsilon > 0$

$$\lim_{|x_0|\to0} P\Bigl\{\sup_{t>0}\ |x(t)| > \varepsilon\Bigr\} = 0 \tag{2.5.14}$$


Accordingly, Konstantinov in [59] proved the following.

Theorem 2.3

The solution x(t) = 0 of the system (2.5.12) is almost surely
stable for t > 0 if there exists a function $V(t,x) \in D_L$ (domain
of definition) which for t > 0 satisfies the conditions

(i)  V(t,x) is continuous at x = 0 and V(t,0) = 0

(ii)  $\inf_{|x|>\delta} V(t,x) > a(\delta) > 0$ for any $\delta > 0$

(iii)  $L[V(t,x)] < 0$ in some neighborhood of x = 0.

A suitable Lyapunov function to use is

$$V(t,x) = x^2(t) \tag{2.5.15}$$

Then condition (iii) in Theorem 2.3 says that

$$E\{V(t+1,x) - V(t,x)\} < 0 \tag{2.5.16}$$

and using Eq. (2.5.12)

$$\bar a^2 + 2\bar a\bar b\,g(t) + \bar b^2 g^2(t) < 1 \tag{2.5.17}$$

We now show that for $|\bar a + \bar b\,g(t)| < 1$, almost every
sample sequence {x(t)} would approach zero. Following [54],
we have

Theorem 2.4

The equilibrium solution of Eq. (2.5.12) is almost surely
stable if $|\bar a + \bar b g| < 1$.

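Proof: We must show that

$$\lim_{\delta\to0} P\Bigl\{\sup_{|x_0|<\delta}\ \sup_{t>0}\ |x(t,\omega)| > \varepsilon\Bigr\} = 0 \tag{2.5.18}$$

We have

$$\lim_{\delta\to0} P\Bigl\{\sup_{|x_0|<\delta}\ \sup_{t>0}\ |x(t,\omega)| > \varepsilon\Bigr\} = \lim_{\delta\to0} P\Bigl\{\sup_{|x_0|<\delta}\ \sup_{t>0}\ |\phi(t,0)|\,|x_0| > \varepsilon\Bigr\} \tag{2.5.19}$$

where $\phi(t,0)$ is the solution of the difference equation

$$\phi(t+1,0) = c(t)\,\phi(t,0) \tag{2.5.20}$$

Hence, Eq. (2.5.19) becomes

$$\lim_{\delta\to0} P\Bigl\{\sup_{t>0}\ |\phi(t,0)| > \frac{\varepsilon}{\delta}\Bigr\} \le \lim_{\delta\to0}\left[P\Bigl\{\sup_{0<t<T(\omega)} |\phi(t,0)| > \frac{\varepsilon}{\delta}\Bigr\} + P\Bigl\{\sup_{t>T(\omega)} |\phi(t,0)| > \frac{\varepsilon}{\delta}\Bigr\}\right] \tag{2.5.21}$$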


We note that

$$\phi(t,0) = \prod_{\tau=0}^{t-1}\bigl(a(\tau) + b(\tau)\,g\bigr) \tag{2.5.22}$$

Therefore, the first term in Eq. (2.5.21) is given by

$$\lim_{\delta\to0} P\Bigl\{\sup_{0<t<T(\omega)} |\phi(t,0)| > \frac{\varepsilon}{\delta}\Bigr\} = \lim_{n\to\infty} P\Bigl\{\sup_{0<t<T(\omega)} |\phi(t,0)| > n\varepsilon\Bigr\} = 0 \tag{2.5.23}$$

since $|\bar a + \bar b g| < 1$.

For ergodic process in the parameters,

$$\lim_{t\to\infty} \frac{1}{t}\,\phi(t,\omega) = E\{\phi(t,\omega)\} \tag{2.5.24}$$

Given $\beta > 0$, there exists then a random time $T_\beta(\omega)$
such that

$$\Bigl|\frac{1}{t}\,\phi(t,\omega) - E\{\phi(t,\omega)\}\Bigr| < \beta \tag{2.5.25}$$

for all $t > T_\beta(\omega)$. Since, with $\bar c = \bar a + \bar b g$,

$$E\{\phi(t,\omega)\} = \bar c^{\,t} \tag{2.5.26}$$

then

$$\frac{1}{t}\,\phi(t,\omega) < \bar c^{\,t} + \beta \tag{2.5.27}$$

for all $t > T_\beta(\omega)$ and

$$\phi(t,\omega) < t\,(\bar c^{\,t} + \beta) \quad \text{almost surely} \tag{2.5.28}$$


The second term in Eq. (2.5.21) is, therefore, given by

$$\lim_{\delta\to0} P\Bigl\{\sup_{t\ge T(\omega)} |\phi(t,0)| > \frac{\varepsilon}{\delta}\Bigr\} \le \lim_{\delta\to0} P\Bigl\{\sup_{t>T(\omega)} \bigl|(\bar c^{\,t} + \beta)\,t\bigr| > \frac{\varepsilon}{\delta}\Bigr\} \tag{2.5.29}$$

Now for arbitrarily small $\beta \to 0$ for $T(\omega) \to \infty$ in Eqs.
(2.5.24) and (2.5.25), we have in the limit

$$T_\beta(\omega)\,\bar c^{\,T_\beta(\omega)} = (\bar c^{\,t} + \beta)\,t$$

so that Eq. (2.5.29) becomes

$$\lim_{\delta\to0} P\Bigl\{\bigl|T(\omega)\,\bar c^{\,T(\omega)}\bigr| > \frac{\varepsilon}{\delta}\Bigr\} = 0 \tag{2.5.30}$$

since $|\bar c| < 1$ and $T(\omega)$ belongs to the set of positive integers.

Combining Eqs. (2.5.21), (2.5.23), and (2.5.30) we
complete the proof.

We demonstrate that the mean-square stability condition
is stronger than the almost sure stability criterion.
From Eq. (2.5.8),

$$g = -\,\frac{\bar a\bar b}{\Sigma_{bb} + \bar b^2} \qquad (\Sigma_{ab} = 0) \tag{2.5.31}$$

Substituting this into

$$|\bar a + \bar b g| < 1 \tag{2.5.32}$$

we get

$$\left|\frac{\bar a\,\Sigma_{bb}}{\Sigma_{bb} + \bar b^2}\right| < 1 \tag{2.5.33}$$

which does not hold for the general case ($|\bar a| > 1$).

Since almost sure stability requires $|\bar a + \bar b g| < 1$,
this implies that

$$\bar a^2 + 2\bar a\bar b\,g + \bar b^2 g^2 < 1$$

Note that this is less restrictive than the mean-square
stability condition given by Eq. (2.5.8). Almost sure
stability (pointwise stability) states that for the stochastic
system under linear feedback Eq. (2.5.12), the
equilibrium solution x(t) = 0 is stochastically stabilizable.
It ensures the existence of a control that will drive the
system towards zero (except for random fluctuations). It
differs from mean-square stability, which
deals with the ensemble of sample paths and says that the
variance of x(t) is finite and bounded if and only if m < 1.
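The gap between the two conditions, as they are stated in this section, can be illustrated with a small check: the almost sure criterion of Theorem 2.4 involves only the means, while the mean-square criterion involves the full second moment of Eq. (2.5.7). The function name and the numbers below are illustrative assumptions.

```python
def stability_conditions(g, a_bar, b_bar, s_aa, s_bb, s_ab):
    """Return (almost-sure condition, mean-square condition) as stated in the text.

    Almost sure (Theorem 2.4): |a_bar + b_bar * g| < 1.
    Mean square: E{c^2} < 1, with E{c^2} expanded as in Eq. (2.5.7).
    """
    mean_c = a_bar + b_bar * g
    second = ((s_aa + a_bar ** 2)
              + 2.0 * g * (s_ab + a_bar * b_bar)
              + g ** 2 * (s_bb + b_bar ** 2))
    return abs(mean_c) < 1.0, second < 1.0


# Large uncertainty in a(t); the gain is the minimizing gain of Eq. (2.5.8).
a_bar, b_bar, s_aa, s_bb, s_ab = 0.5, 1.0, 1.2, 0.3, 0.0
g_star = -(s_ab + a_bar * b_bar) / (s_bb + b_bar ** 2)
almost_sure_ok, mean_square_ok = stability_conditions(
    g_star, a_bar, b_bar, s_aa, s_bb, s_ab)
print(almost_sure_ok, mean_square_ok)
```

In this example the condition on the means holds while the second-moment condition fails, since the second moment adds the variance of c(t) to the squared mean.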

2.6  The  Discounted  Cost  Problem

In this section we will consider the effects of including
a discount factor in the objective function. Traditionally,
discount factors have been used in economic problems
to emphasize the near-term worth of the utility function
as compared to the long-term worth [60]. One may then
suspect that the inclusion of the discount factor in the
objective function may increase the threshold at which the
optimal control for the infinite horizon problem is
well-defined. That this is indeed the case will be shown in the
development below.

In control systems, the discount factor has been
used for the infinite-time control problem. Since the cost is
infinite in the infinite horizon problem, it is usually
normalized by the planning horizon N, that is, one considers

$$\lim_{N\to\infty} \frac{1}{N}\,E\left\{\sum_{t=1}^{N}\bigl[Q\,x^2(t) + R\,u^2(t)\bigr]\right\} \tag{2.6.1}$$

Kushner ([61], pp. 152-153) shows that this can be closely
approximated by

$$E\sum_{t=0} \alpha^t\bigl[Q\,x^2(t) + R\,u^2(t)\bigr], \qquad 0 < \alpha < 1 \tag{2.6.2}$$

The use of the discount factor $\alpha$ guarantees that all costs
are finite and prevents J from "blowing up" as $N \to \infty$.

We are given that the system is described by Eqs.
(2.2.1)-(2.2.6). We consider the minimization of the
discounted quadratic cost given by

$$J = E\left\{\sum_{t=0}^{N} \alpha^t\bigl[Q\,x^2(t) + R\,u^2(t)\bigr]\right\} \tag{2.6.3}$$

where N is the planning horizon and Q > 0, R > 0. The case
$\alpha = 1$ is the undiscounted cost problem we have considered
in Sections 2.2-2.4. The state x(t) can be measured exactly.

The solution to the optimal control problem is obtained
by the method of dynamic programming. The derivation
follows closely that given in Section 2.3 for the
undiscounted problem and, hence, is not repeated. We note
that in the discounted cost problem, the dynamic programming
algorithm can be modified for the cost functional of the
form

$$E\left\{\alpha^N K(x(N)) + \sum_{t=0}^{N-1} \alpha^t L\bigl(t, x(t), u(t), \xi(t)\bigr)\right\}, \qquad 0 < \alpha < 1 \tag{2.6.4}$$

to be

$$V(x(N)) = K(x(N)), \qquad V(x(t)) = \inf_{u(t)} E\bigl\{L\bigl(t, x(t), u(t), \xi(t)\bigr) + \alpha\,V(x(t+1))\bigr\} \tag{2.6.5}$$


Theorem 2.5

Given a linear stochastic system described by Eqs. (2.2.1) to
(2.2.6) and the cost functional (2.6.3), the optimal feedback
control at each instant of time is given by a linear
transformation of the measured state, that is,

$$u(t) = -G(t)\,x(t) \tag{2.6.6}$$

$$G(t) = \frac{\alpha\,K(t+1)(\Sigma_{ab} + \bar a\bar b)}{R + \alpha\,K(t+1)(\Sigma_{bb} + \bar b^2)} \tag{2.6.7}$$

K(t) satisfies the Riccati-like recursive equation

$$K(t) = Q + \alpha\,K(t+1)(\Sigma_{aa} + \bar a^2) - \frac{\alpha^2 K^2(t+1)(\Sigma_{ab} + \bar a\bar b)^2}{R + \alpha\,K(t+1)(\Sigma_{bb} + \bar b^2)}, \qquad K(N) = Q \tag{2.6.8}$$

The optimal average cost is given by

$$J^* = K(0)\,x^2(0) + \sum_{t=0}^{N-1} \alpha^{t+1} K(t+1)\,\Xi(t) \tag{2.6.9}$$

Proof: Use dynamic programming as in Section 2.3.

The optimal solution given in Theorem 2.5 exists for
all finite horizon N. However, the solution to the optimal
control problem may fail to exist (in the sense that the
optimum cost is infinite) for the infinite horizon case.
The precise result is stated as follows.

Theorem 2.6

Let the horizon time N go to $\infty$. Define the undiscounted
threshold parameter by Eq. (2.4.3),

$$m = (\Sigma_{aa} + \bar a^2) - \frac{(\Sigma_{ab} + \bar a\bar b)^2}{\Sigma_{bb} + \bar b^2} \tag{2.6.10}$$

Then the optimal solution to the infinite horizon problem
exists if and only if $m < 1/\alpha$.

Proof: Let $\tilde a(t) = \sqrt{\alpha}\,a(t)$ and $\tilde R = R/\alpha$, $\tilde\Sigma_{aa} = \alpha\,\Sigma_{aa}$, $\tilde\Sigma_{ab} = \sqrt{\alpha}\,\Sigma_{ab}$.
Then after some algebra, Eq. (2.6.8) becomes

$$K(t) = Q + K(t+1)\bigl(\tilde\Sigma_{aa} + \bar{\tilde a}^2\bigr) - \frac{K^2(t+1)\bigl(\tilde\Sigma_{ab} + \bar{\tilde a}\,\bar b\bigr)^2}{\tilde R + K(t+1)(\Sigma_{bb} + \bar b^2)} \tag{2.6.11}$$

The form of the nonlinear difference equation is
identical to that of Eq. (2.4.1). Hence the results follow
from Theorem 2.1.
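The change of variables used in the proof can be checked numerically: iterating the discounted recursion (2.6.8) directly should give the same values as iterating the undiscounted form with the scaled second moments and scaled control weighting. The sketch below works with the second moments A, B, C (of a, of b, and their cross moment) as inputs; these names and the numbers are illustrative assumptions.

```python
import math


def discounted_K0(N, alpha, A, B, C, Q, R):
    """K(0) from N backward steps of Eq. (2.6.8), starting at K(N) = Q."""
    K = Q
    for _ in range(N):
        K = Q + alpha * K * A - (alpha * K * C) ** 2 / (R + alpha * K * B)
    return K


def undiscounted_K0(N, A, B, C, Q, R):
    """K(0) from N backward steps of the undiscounted form, K(N) = Q."""
    K = Q
    for _ in range(N):
        K = Q + K * A - (K * C) ** 2 / (R + K * B)
    return K


A, B, C = 2.21, 1.49, 1.1       # m = A - C**2 / B is about 1.40
Q, R, alpha = 1.0, 1.0, 0.6     # alpha * m is about 0.84, below one

direct = discounted_K0(300, alpha, A, B, C, Q, R)
scaled = undiscounted_K0(300, alpha * A, B, math.sqrt(alpha) * C, Q, R / alpha)
print(direct, scaled)                               # the two agree
print(discounted_K0(200, 0.9, A, B, C, Q, R))       # alpha * m > 1: large
```

With these numbers the undiscounted problem has no infinite horizon solution (m > 1), yet the discounted recursion converges for the smaller discount factor because the product of the discount factor and m is below one.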

The above results imply that if the stability condition
$m < 1/\alpha$ holds, then the limiting solution of Eq. (2.6.8)
exists, is bounded, and approaches a constant value $\hat K$,

$$\lim_{N\to\infty} K(t) = \hat K \tag{2.6.12}$$

and it is the positive solution to the algebraic equation

$$\hat K = Q + \alpha(\Sigma_{aa} + \bar a^2)\hat K - \frac{\alpha^2 \hat K^2(\Sigma_{ab} + \bar a\bar b)^2}{R + \alpha\hat K(\Sigma_{bb} + \bar b^2)} \tag{2.6.13}$$

and, consequently, the linear gain G(t) in Eq. (2.6.7) also
approaches a constant value

$$\hat G = \lim_{N\to\infty} G(t) = \frac{\alpha\hat K(\Sigma_{ab} + \bar a\bar b)}{R + \alpha\hat K(\Sigma_{bb} + \bar b^2)} \tag{2.6.14}$$

Otherwise, $\lim K(t)$ is not bounded, and K(t) grows
exponentially as

$$\lim_{N\to\infty} K(t) = e^{\alpha m N} \tag{2.6.15}$$

We remark that in the discounted problem, the more
the future cost is discounted ($\alpha \to 0$) the more uncertainty
can be tolerated in the randomness of the parameters and
still have an optimal solution for the infinite horizon
problem.

Thus in the case that the solution exists ($m < 1/\alpha$)
the use of the optimal control law Eq. (2.6.6), where G(t)
is the constant gain given by Eq. (2.6.14), will result in
the following optimum evolution of the state x(t),

$$x(t+1) = \left[a(t) - \frac{\alpha\hat K(\Sigma_{ab} + \bar a\bar b)}{R + \alpha\hat K(\Sigma_{bb} + \bar b^2)}\,b(t)\right]x(t) \tag{2.6.16}$$


One may suspect that the existence of an optimal control
in the case $m < 1/\alpha$ results in the feedback stabilization
according to Eq. (2.6.16). This is not true. We will now
show that the optimal closed-loop system (2.6.16) is unstable
in a mean-square sense in the region $1 < m < 1/\alpha$ in spite of
the existence of an optimum control in the region specified
above.

Recall that the stochastic system Eq. (2.2.1) is
stabilizable if and only if the undiscounted threshold
parameter m defined in Eq. (2.4.3) is less than unity. This
holds for any stochastic linear system and any linear feedback
control law. Applying Theorem 2.2, the optimal
closed-loop system of Eq. (2.6.16) is not stable in a
mean-square sense in the region $1 < m < 1/\alpha$, where $\alpha$ is the
discount factor.
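
The gap between existence and stability can be checked numerically. The following Python fragment is a minimal sketch under stated assumptions, not part of the original analysis: it iterates the discounted recursion whose fixed point is Eq. (2.6.13), forms the limiting gain of Eq. (2.6.14), and evaluates the one-step mean-square growth factor $E\{(a(t) - G\,b(t))^2\}$ of the closed loop (2.6.16). The function names and numerical values are hypothetical, and the closed form used for m is assumed from the structure of Eq. (2.6.13) with $\alpha = 1$ and K large.

```python
def threshold(a_bar, b_bar, s_aa, s_bb, s_ab=0.0):
    # Assumed closed form of the undiscounted threshold parameter m.
    return s_aa + a_bar**2 - (s_ab + a_bar * b_bar)**2 / (s_bb + b_bar**2)


def discounted_riccati(a_bar, b_bar, s_aa, s_bb, s_ab, alpha, Q, R, steps):
    # Backward iteration of the discounted recursion, started from K = Q.
    K = Q
    for _ in range(steps):
        den = R + alpha * K * (s_bb + b_bar**2)
        K = (Q + alpha * K * (s_aa + a_bar**2)
             - alpha**2 * K**2 * (s_ab + a_bar * b_bar)**2 / den)
    # Limiting gain, as in Eq. (2.6.14).
    G = alpha * K * (s_ab + a_bar * b_bar) / (R + alpha * K * (s_bb + b_bar**2))
    return K, G


def closed_loop_growth(G, a_bar, b_bar, s_aa, s_bb, s_ab=0.0):
    # E{(a - G b)^2}: factor by which E{x^2} is multiplied at each step
    # when a(t), b(t) are white and independent of x(t).
    return (s_aa + a_bar**2
            - 2.0 * G * (s_ab + a_bar * b_bar)
            + G**2 * (s_bb + b_bar**2))


# Illustrative values chosen so that 1 < m < 1/alpha.
params = dict(a_bar=1.1, b_bar=1.0, s_aa=0.7, s_bb=0.5, s_ab=0.0)
alpha = 0.8
m = threshold(**params)
K, G = discounted_riccati(alpha=alpha, Q=1.0, R=1.0, steps=2000, **params)
growth = closed_loop_growth(G, **params)
print(f"m = {m:.4f}, 1/alpha = {1.0 / alpha:.4f}")
print(f"K = {K:.4f}, G = {G:.4f}, growth factor = {growth:.4f}")
```

With these values the recursion settles to a finite K, yet the growth factor exceeds unity, so the second moment of the state diverges under the optimal gain. Because the growth factor is a quadratic in G whose minimum over G equals m, no linear gain can do better when m > 1.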

This is a very interesting and important result.

The implications of the above results are best understood by
referring to Fig. 2.9a. The undiscounted threshold parameter
m can be thought of as a measure of the system parameter
uncertainty, since for any given mean values $\bar a$ and $\bar b$ of the
random parameters a(t) and b(t), m increases monotonically
with both parameter variances $\Sigma_{aa}$ and $\Sigma_{bb}$. Note that m is
uniquely characterized by the stochastic system itself and
is independent of the performance criterion J used. For
any given discount factor $\alpha$, if the system uncertainty is
large enough (Region C in Fig. 2.9a), no stabilizing
optimal control exists for the infinite horizon problem.
If the system uncertainty is sufficiently small (Region A
in Fig. 2.9a), then the optimal and stabilizing feedback
control exists for the infinite-time problem.
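
The regions of Fig. 2.9a can be summarized in a small lookup. This is a sketch only: the function name is hypothetical, and the assignment of the boundary points m = 1 and m = 1/α to the less favorable region is an assumption, since the text discusses only the open intervals. Region B is the intermediate interval $1 < m < 1/\alpha$ in which an optimal control exists but the closed loop is not mean-square stable.

```python
def classify_region(m, alpha):
    """Return (region, optimal_control_exists, mean_square_stable).

    m     : undiscounted threshold parameter of the system
    alpha : discount factor, assumed here to satisfy 0 < alpha < 1
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("this sketch assumes 0 < alpha < 1")
    if m < 1.0:
        return ("A", True, True)     # optimal and stabilizing
    if m < 1.0 / alpha:
        return ("B", True, False)    # optimal control exists, loop unstable
    return ("C", False, False)       # no infinite horizon solution


for m_value in (0.7, 1.1, 1.5):
    print(m_value, classify_region(m_value, alpha=0.8))
```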


Figure 2.9 Behavior of solution as a function of threshold
parameter m, for discount factor $0 < \alpha < 1$ (axis marked at
m = 0, m = 1, and m = 1/α). Legend:

O: Optimal infinite horizon controls exist
N: Optimal infinite horizon controls do not exist
S: Closed loop system stochastically mean square stable
U: Closed loop system stochastically mean square unstable

The interesting phenomenon occurs in the extended
existence region B. Note that the size of this region
increases as the future is discounted more and more ($\alpha \to 0$).

In the extended region B in Fig. 2.9a optimal controls exist,
but the resulting feedback system is unstable in the
mean-square sense according to Theorem 2.2. The existence of a
unique optimal control law in this region is due solely to
the use of the discount factor in the cost functional.

All this seems to support a separate analysis to
determine the stochastic stability conditions of the underlying
systems, as has been considered here. A careful analysis of the
stochastic optimization problem from the viewpoints of both
optimal control theory and stability theory is needed to
obtain the stochastic controllability and stability conditions
for the purely random parameter systems. In most
stochastic control problems encountered thus far, optimality
and stability lead to the same conclusions: optimal
closed-loop control laws result in mean-square stable systems.
This is clearly not the case for uncertain systems
in which the randomness enters multiplicatively as well as
additively into the stochastic system in a significant way.

Following Magill [62] and Ramsey [63], where the
discount rate $\delta = r - \rho$ is allowed to vary from $-\infty$ to $+\infty$ with
appropriate economic interpretations, we shall now consider
the discrete-time problem where the discount factor $\alpha$ can
take on values $1 < \alpha < \infty$. We can argue heuristically that in
order for the cost functional Eq. (2.6.3) to remain finite
for larger N, the terms in the cost functional must decrease
faster than the growth in the $\alpha^t$ factor. Specifically, we have
the cost functional


Figure 2.10 Optimality and stability regions for system
equation (2.2.1) (horizontal axis: discount factor α, marked from 0 to 5)


2.7 Control of Linear Systems With Correlated Multiplicative
and Additive Noises


The results we have obtained for the purely random
(white) parameter stochastic control problem can be extended
to allow for correlations between the system additive noise
$\xi(t)$ and the random parameters a(t) and b(t). We define the
correlations by

$$E\{(a(t) - \bar a(t))\,\xi(s)\} = \Sigma_{a\xi}(t)\,\delta(t,s)$$ (2.7.1)

$$E\{(b(t) - \bar b(t))\,\xi(s)\} = \Sigma_{b\xi}(t)\,\delta(t,s)$$ (2.7.2)


The control problem is to minimize the average
quadratic cost functional,

$$J = E_{\xi,a,b}\left\{ Q\,x^2(N) + \sum_{t=0}^{N-1} \left[ Q\,x^2(t) + R\,u^2(t) \right] \right\}$$ (2.7.3)

subject to the same dynamical system Eq. (2.2.1),

$$x(t+1) = a(t)\,x(t) + b(t)\,u(t) + \xi(t)$$ (2.7.4)


We have that

$$V(N) = Q\,x^2(N)$$ (2.7.5)

$$V(N-1) = E\{\,(Q\,a^2(N-1) + Q)\,x^2(N-1) + (Q\,b^2(N-1) + R)\,u^2(N-1) + 2Q\,(a(N-1)\,b(N-1)\,x(N-1) + b(N-1)\,\xi(N-1))\,u(N-1) + 2Q\,a(N-1)\,\xi(N-1)\,x(N-1) \mid X^{N-1}\} + Q\,\Xi$$ (2.7.6)

Now the noise $\xi(t)$ is correlated with a(t) and b(t).
The cost-to-go is minimized when

$$u(N-1) = -G(N-1)\,x(N-1) - p(N-1)$$ (2.7.7)


where

$$G(N-1) = \frac{Q\,(\Sigma_{ab} + \bar a\,\bar b)}{R + Q\,(\Sigma_{bb} + \bar b^2)}$$ (2.7.8)

$$p(N-1) = \frac{Q\,\Sigma_{b\xi}}{R + Q\,(\Sigma_{bb} + \bar b^2)}$$ (2.7.9)


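Substituting this optimal solution into Eq. (2.7.6), we
obtain for the optimal cost-to-go that

$$J^*(x(N-1), N-1) = \left[ Q + Q\,(\bar a^2 + \Sigma_{aa}) - \frac{Q^2\,(\Sigma_{ab} + \bar a\,\bar b)^2}{R + Q\,(\bar b^2 + \Sigma_{bb})} \right] x^2(N-1) + 2\left[ Q\,\Sigma_{a\xi} - \frac{Q^2\,(\Sigma_{ab} + \bar a\,\bar b)}{R + Q\,(\Sigma_{bb} + \bar b^2)}\,\Sigma_{b\xi} \right] x(N-1) + \text{constants}$$

$$= K(N-1)\,x^2(N-1) + 2\,k(N-1)\,x(N-1) + \text{const}$$ (2.7.10)

where

$$K(N-1) \triangleq Q + Q\,(\bar a^2 + \Sigma_{aa}) - \frac{Q^2\,(\Sigma_{ab} + \bar a\,\bar b)^2}{R + Q\,(\Sigma_{bb} + \bar b^2)}$$ (2.7.11)

$$k(N-1) \triangleq Q\,\Sigma_{a\xi} - \frac{Q^2\,(\Sigma_{ab} + \bar a\,\bar b)}{R + Q\,(\Sigma_{bb} + \bar b^2)}\,\Sigma_{b\xi}$$ (2.7.12)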


Going back one more step to N-2, we see that the
structure of the minimization problem is the same. By
induction on t, we then obtain the following result.

Theorem 2.7

Under the assumptions in Section 2.2, but allowing $\xi(t)$ to
be correlated with both a(t) and b(t), the solution to the
optimal control problem specified by Eqs. (2.7.3) and (2.7.4)
exists and is of the form

$$u(t) = -G(t)\,x(t) - p(t)$$ (2.7.13)

$$G(t) = \frac{K(t+1)\,(\Sigma_{ab} + \bar a\,\bar b)}{R + K(t+1)\,(\Sigma_{bb} + \bar b^2)}$$ (2.7.14)

$$p(t) = \frac{\bar b\,k(t+1) + K(t+1)\,\Sigma_{b\xi}}{R + K(t+1)\,(\Sigma_{bb} + \bar b^2)}$$ (2.7.15)

$$K(t) = Q + (\bar a^2 + \Sigma_{aa})\,K(t+1) - \frac{K^2(t+1)\,(\Sigma_{ab} + \bar a\,\bar b)^2}{R + K(t+1)\,(\Sigma_{bb} + \bar b^2)}$$ (2.7.16)

$$k(t) = (\bar a - \bar b\,G(t))\,k(t+1) + K(t+1)\,(\Sigma_{a\xi} - G(t)\,\Sigma_{b\xi})$$ (2.7.17)

with the boundary conditions

$$K(N) = Q, \qquad k(N) = 0$$ (2.7.18)
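
The recursions of Theorem 2.7 are straightforward to evaluate backward in time. The following Python fragment is a minimal sketch for the stationary scalar case; the function name, the argument names (`s_axi` and `s_bxi` stand for the cross-covariances of $\xi$ with a and b), and the numerical values are illustrative assumptions rather than anything prescribed by the theorem.

```python
def correlated_lq_recursion(a_bar, b_bar, s_aa, s_bb, s_ab,
                            s_axi, s_bxi, Q, R, N):
    # K[t], k[t] for t = 0..N; G[t], p[t] for t = 0..N-1.
    K = [0.0] * (N + 1)
    k = [0.0] * (N + 1)
    G = [0.0] * N
    p = [0.0] * N
    K[N] = Q          # boundary conditions (2.7.18)
    k[N] = 0.0
    cross = s_ab + a_bar * b_bar
    for t in range(N - 1, -1, -1):
        den = R + K[t + 1] * (s_bb + b_bar**2)
        G[t] = K[t + 1] * cross / den                          # (2.7.14)
        p[t] = (b_bar * k[t + 1] + K[t + 1] * s_bxi) / den     # (2.7.15)
        K[t] = (Q + (a_bar**2 + s_aa) * K[t + 1]
                - K[t + 1]**2 * cross**2 / den)                # (2.7.16)
        k[t] = ((a_bar - b_bar * G[t]) * k[t + 1]
                + K[t + 1] * (s_axi - G[t] * s_bxi))           # (2.7.17)
    return K, k, G, p


# Same system with and without noise/parameter cross-correlation.
base = dict(a_bar=0.9, b_bar=1.0, s_aa=0.1, s_bb=0.2, s_ab=0.05,
            Q=1.0, R=1.0, N=20)
K0, k0, G0, p0 = correlated_lq_recursion(s_axi=0.0, s_bxi=0.0, **base)
K1, k1, G1, p1 = correlated_lq_recursion(s_axi=0.3, s_bxi=0.2, **base)
print("feedback gains identical:", G0 == G1)
print("p(0) uncorrelated:", p0[0], " p(0) correlated:", p1[0])
```

The comparison at the end illustrates two properties of the solution: the feedback gain G(t) is unaffected by the cross-covariances, and the correction term p(t) vanishes identically when they are zero.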


The optimal policy is seen to consist of a feedback
component G(t), together with a fixed component p(t). It is
interesting to note that the expression for G(t) is identical
to that given in Section 2.3, Eq. (2.3.12), so that feedback
regulation of the state is independent of any correlation
between the additive and multiplicative noise. The optimal
feedback control law is still linear in the state. On the
other hand, the correction term p(t) depends crucially
on the cross-covariances; if they are zero this term
vanishes, leaving the feedback component alone, and the
solution reduces to the results given in Section 2.3.

2.8 Conclusions

This chapter shows that the optimal control of dynamic
systems with known structure, but with randomly varying
parameters (modeled as white noise), has some limitations.
In particular, by means of a simple scalar linear-quadratic
control problem, it is shown in Section 2.4 that the infinite
horizon solution does not exist if the parameter uncertainty
exceeds a certain quantifiable threshold. We call this the
Uncertainty Threshold Principle. This result has major
engineering implications for the modeling accuracy required, in
terms of the variance of the parameters of a dynamical system,
before any stochastic optimal control scheme makes sense.

In Sections 2.5 and 2.6, it is demonstrated that
the uncertainty threshold parameter is uniquely characterized
by the stochastic system itself and is independent of the
performance criterion used. Optimal controls may still be
defined, due to the inclusion of a discount factor in the
performance index, in a region where the closed-loop system is
unstable in a mean-square sense. The engineering implication
is that a stochastic stability analysis should be carried out
independently of the stochastic optimization results. In most
stochastic optimization problems solved to date, optimality
and stability are not in conflict; optimal controls result
in stable systems. This is clearly not the case for systems
in which the randomness enters multiplicatively as well as
additively.

CHAPTER 3

OPTIMAL LINEAR ESTIMATION OF STOCHASTIC SYSTEMS
WITH RANDOM PARAMETERS

3.1 Introduction

In Chapter 2 we have considered the optimal stochastic
control of a scalar linear stochastic dynamical system with
purely random parameters. We would like to extend the analysis
to scalar systems with noisy measurements. Before doing
that we will examine the estimation problem.

It is well known that for the standard linear-
quadratic-Gaussian problem, the optimal stochastic control
problem separates into the optimal deterministic control
problem and the optimal estimation problem with no control. That
the two optimization problems are not completely unrelated is
embodied in the Duality Theorem, which says that one problem
is the dual of the other. We will show that the optimal
linear estimation results are not completely the formal dual
of the optimal control problem. Whereas the stochastic
control derived in Chapter 2 is truly optimal, the
estimation algorithm derived in this chapter is optimal
only in the class of linear estimators. The technical
assumptions we make to derive the linear unbiased estimators
exclude the filter from being the truly optimal estimator.
We present the results for the linear minimum variance
filter since the optimal filter would have to be nonlinear and
infinite dimensional.


We will state the problem of state estimation with
purely random parameters in the next section. The mathematical
model developed here can be related to the state-dependent
and control-dependent noise models. In Section 3.3, we derive
the optimal linear unbiased estimator in the minimum variance
sense. The estimator is to operate in the open-loop sense.
We will consider feedback control in the next chapter. In
Sections 3.4 and 3.5, the asymptotic behavior of the linear
unbiased filter is examined, first for the case where the random
parameters are all mutually uncorrelated at all times and
next for the case where the random parameters may be correlated
with each other at each instant of time. A stability
analysis for the stochastic estimation problem in which the
purely random (white) parameters are correlated has not been
found in the literature. We note that the results in this
chapter were obtained before the related references [64] and
[65] were found.

Linear optimal filtering for a continuous-time
linear dynamical system in which the process and observation
have state-dependent noise was considered in [66]. For the
time-invariant problems, it was shown that the second moment
of the state must be asymptotically stable for the uniqueness
of the filtering solution. Necessary and sufficient conditions
for the second moment to be asymptotically stable are
given in [67]. The discrete-time filtering problem was considered
in [68] for the case where only the measurement
equation contains state-dependent noise and no input is
applied.


3.2 Problem Statement

Suppose that the scalar linear stochastic dynamical
system is described by the difference equation

$$x(t+1) = a(t)\,x(t) + b(t)\,u(t) + \xi(t)$$ (3.2.1)

We include the second term in the estimation problem since
this will be of importance in the case to be discussed when
a(t) and b(t) are correlated random parameters. As long as
u(t) is prescribed, this represents open-loop optimal estimation.
When we allow u(t) to be a function of the measurement,
the control system is closed-loop.

Let us assume that the measurement equation is given
by

$$z(t) = c(t)\,x(t) + \theta(t)$$ (3.2.2)

Assume that the initial state x(0) is a random variable with
given a priori statistics

$$E\{x(0)\} = \bar x_0, \qquad E\{(x(0) - \bar x_0)^2\} = \Sigma_{x0}$$ (3.2.3)


The initial state variable is assumed to be uncorrelated with
any other random variables in the system. The input u(t) is
assumed to be a deterministic quantity in the estimation
problem.

The additive noises $\xi(t)$ and $\theta(t)$ are assumed to be
zero-mean Gaussian white noises, uncorrelated with each other
at all times, and to have known a priori statistics

$$E\{\xi(t)\,\xi(\tau)\} = \Xi(t)\,\delta(t,\tau)$$ (3.2.4)

$$E\{\theta(t)\,\theta(\tau)\} = \Theta(t)\,\delta(t,\tau)$$ (3.2.5)
What distinguishes our problem from the standard
linear Gaussian estimation problem is that the parameters
a(t), b(t), and c(t) are assumed to be random parameters
uncorrelated in time, with known means and covariances:

$$E\{a(t)\} = \bar a(t), \qquad E\{(a(t) - \bar a(t))\,(a(\tau) - \bar a(\tau))\} = \Sigma_{aa}(t)\,\delta(t,\tau)$$ (3.2.6)

$$E\{b(t)\} = \bar b(t), \qquad E\{(b(t) - \bar b(t))\,(b(\tau) - \bar b(\tau))\} = \Sigma_{bb}(t)\,\delta(t,\tau)$$ (3.2.7)

$$E\{c(t)\} = \bar c(t), \qquad E\{(c(t) - \bar c(t))\,(c(\tau) - \bar c(\tau))\} = \Sigma_{cc}(t)\,\delta(t,\tau)$$ (3.2.8)

The random parameters may be correlated with each other at
each instant of time, so that

$$E\{(a(t) - \bar a(t))\,(b(\tau) - \bar b(\tau))\} = \Sigma_{ab}(t)\,\delta(t,\tau)$$ (3.2.9)

Moreover, the random parameter c(t) may be correlated with
a(t) and b(t), that is

$$E\{(a(t) - \bar a(t))\,(c(\tau) - \bar c(\tau))\} = \Sigma_{ac}(t)\,\delta(t,\tau)$$ (3.2.10)

$$E\{(b(t) - \bar b(t))\,(c(\tau) - \bar c(\tau))\} = \Sigma_{bc}(t)\,\delta(t,\tau)$$ (3.2.11)

We assume that the random parameters are independent of the
additive white noise $\xi(t)$ in the system dynamics and $\theta(t)$
in the measurement. Note that in Eq. (3.2.1), if b(t) is
uncorrelated with a(t) and c(t) for all t, then the second
product term essentially affects the system dynamics as an
additional driving noise that can be combined with $\xi(t)$ in
the solution to the filtering problem, as we will see.

The stochastic linear system given by the difference
Eq. (3.2.1) is a Gauss-Markov process, since the random
parameters are assumed to be Gaussian white. However, the
a posteriori conditional density function is non-Gaussian
due to the random system parameter a(t). The conditional
probability density cannot in general be computed exactly
since an infinite number of conditional moments are needed.

In practice then, one would approximate the nonlinear filter
or fix a priori the structure of the estimator to be linear
and unbiased. We will constrain the filter in this chapter
to be linear in both the state and the measurements, although
it can be shown that the linear filter is not optimal in the
class of all possible filters for the system Eqs. (3.2.1)
and (3.2.2) [65].

We shall denote the past measurements by

$$Z^t \triangleq \{z(1), z(2), \ldots, z(t)\}$$

3.3 Derivation of the Linear Minimum Variance Filter

We consider now the Kalman-type linear filter of
the following recursive form [69], the conditional mean being
given by

$$\hat x(t+1|t) = F(t)\,\hat x(t|t-1) + G(t)\,u(t) + \psi(t)\,z(t)$$ (3.3.1)

Substituting Eq. (3.2.2) into this equation, we get

$$\hat x(t+1|t) = F(t)\,\hat x(t|t-1) + G(t)\,u(t) + \psi(t)\,c(t)\,x(t) + \psi(t)\,\theta(t)$$ (3.3.2)


Subtracting this equation from Eq. (3.2.1) we get
the estimation error

$$x(t+1) - \hat x(t+1|t) = F(t)\,(x(t) - \hat x(t|t-1)) + [\,a(t) - \psi(t)\,c(t) - F(t)\,]\,x(t) + (b(t) - G(t))\,u(t) - \psi(t)\,\theta(t) + \xi(t)$$ (3.3.3)


We require that the estimate be unbiased, so that

$$E\{x(t+1) - \hat x(t+1|t)\} = 0 \quad \forall t$$ (3.3.4)

Taking the expectation of Eq. (3.3.3) we obtain that

$$F(t) = \bar a(t) - \psi(t)\,\bar c(t)$$ (3.3.5)

$$G(t) = \bar b(t)$$ (3.3.6)


The estimation error then satisfies the recursive equation

$$e(t+1|t) = (\bar a(t) - \psi(t)\,\bar c(t))\,e(t|t-1) + (b(t) - \bar b(t))\,u(t) + [\,(a(t) - \bar a(t)) - \psi(t)\,(c(t) - \bar c(t))\,]\,x(t) - \psi(t)\,\theta(t) + \xi(t)$$ (3.3.7)

and the state estimate evolves as

$$\hat x(t+1|t) = [\,\bar a(t) - \psi(t)\,\bar c(t)\,]\,\hat x(t|t-1) + \bar b(t)\,u(t) + \psi(t)\,z(t)$$ (3.3.8)

Define the conditional error covariance to be

$$\Sigma_{xx}(t+1|t) \triangleq E\{e^2(t+1|t) \mid Z^t\}$$ (3.3.9)

It is evident from Eq. (3.3.7) that the predicted error
covariance $\Sigma_{xx}(t+1|t)$ will involve terms requiring the
computation of the second conditional moment of the state.

We note here that the measurement update is unbiased,
since if we define

$$\bar a(t)\,\hat x(t|t) = [\,\bar a(t) - \psi(t)\,\bar c(t)\,]\,\hat x(t|t-1) + \psi(t)\,z(t)$$ (3.3.10)

then

$$\bar a(t)\,E\{x(t) - \hat x(t|t) \mid Z^t\} = \bar a(t)\,E\{x(t) - \hat x(t|t-1) \mid Z^{t-1}\} + \psi(t)\,E\{c(t)\,x(t) - \bar c(t)\,\hat x(t|t-1) \mid Z^{t-1}\} = 0$$ (3.3.11)

Now, the estimation error covariance is given by

$$\Sigma_{xx}(t+1|t) = [\,\bar a^2(t) + \psi^2(t)\,\bar c^2(t) - 2\,\bar a(t)\,\bar c(t)\,\psi(t)\,]\,\Sigma_{xx}(t|t-1) + [\,\Sigma_{aa}(t) + \psi^2(t)\,\Sigma_{cc}(t) - 2\,\Sigma_{ac}(t)\,\psi(t)\,]\,E\{x^2(t)\} + \Sigma_{bb}(t)\,u^2(t) + \Xi(t) + \psi^2(t)\,\Theta(t) + 2\,[\,\Sigma_{ab}(t) - \psi(t)\,\Sigma_{bc}(t)\,]\,u(t)\,E\{x(t)\}$$ (3.3.12)

where the second moment of the state is given by

$$E\{x^2(t+1)\} = (\bar a^2(t) + \Sigma_{aa}(t))\,E\{x^2(t)\} + (\bar b^2(t) + \Sigma_{bb}(t))\,u^2(t) + \Xi(t) + 2\,(\bar a(t)\,\bar b(t) + \Sigma_{ab}(t))\,u(t)\,E\{x(t)\}$$ (3.3.13)


If we define

$$X(t+1) \triangleq E\{x^2(t+1)\}$$

we can write Eq. (3.3.13) as

$$X(t+1) = (\bar a^2(t) + \Sigma_{aa}(t))\,X(t) + (\bar b^2(t) + \Sigma_{bb}(t))\,u^2(t) + \Xi(t) + 2\,(\bar a(t)\,\bar b(t) + \Sigma_{ab}(t))\,u(t)\,\hat x(t|t)$$ (3.3.14)

and the mean is given by definition

$$\bar a(t)\,\hat x(t|t) = [\,\bar a(t) - \psi(t)\,\bar c(t)\,]\,\hat x(t|t-1) + \psi(t)\,z(t)$$ (3.3.15)

with initial conditions

$$\hat x(0|-1) = E\{x(0)\} = \bar x_0$$ (3.3.16)

$$\Sigma_{xx}(0|-1) = \Sigma_{x0}$$ (3.3.17)

$$X(0) = \Sigma_{x0} + \bar x_0^2$$ (3.3.18)

We now want to determine the filter gain $\psi(t)$ such
that the error covariance in Eq. (3.3.12) is minimized. We
have a deterministic optimization problem. Taking the derivative
with respect to $\psi(t)$ and setting it to zero (the necessary
condition), we get

$$\psi^*(t) = \frac{\bar a(t)\,\bar c(t)\,\Sigma_{xx}(t|t-1) + \Sigma_{ac}(t)\,X(t) + \Sigma_{bc}(t)\,u(t)\,E\{x(t)\}}{\bar c^2(t)\,\Sigma_{xx}(t|t-1) + \Theta(t) + \Sigma_{cc}(t)\,X(t)}$$ (3.3.19)

Substituting this result into Eq. (3.3.12), the
minimum estimation error covariance is

$$\Sigma_{xx}(t+1|t) = \bar a^2(t)\,\Sigma_{xx}(t|t-1) + \Sigma_{bb}(t)\,u^2(t) + \Sigma_{aa}(t)\,X(t) + 2\,\Sigma_{ab}(t)\,u(t)\,E\{x(t)\} + \Xi(t) - \psi^{*2}(t)\,[\,\bar c^2(t)\,\Sigma_{xx}(t|t-1) + \Theta(t) + \Sigma_{cc}(t)\,X(t)\,]$$ (3.3.20)

It can be shown that the optimal filter gain $\psi^*(t)$ in Eq.
(3.3.19) minimizes the error covariance at any time. The
filter gains may be pre-computed since they are independent
of the measurement.
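
That the gain of Eq. (3.3.19) minimizes the error covariance of Eq. (3.3.12) can be checked numerically at a single time step. The Python fragment below is a sketch with hypothetical names and arbitrary illustrative values; `sigma`, `X`, and `mean` stand for $\Sigma_{xx}(t|t-1)$, $X(t)$, and $E\{x(t)\}$ and are simply assumed given.

```python
from dataclasses import dataclass


@dataclass
class FilterStats:
    # Means, covariances and current moments at one time step.
    a_bar: float
    c_bar: float
    s_aa: float
    s_bb: float
    s_cc: float
    s_ab: float
    s_ac: float
    s_bc: float
    xi: float       # covariance of the additive plant noise
    theta: float    # covariance of the measurement noise
    sigma: float    # Sigma_xx(t|t-1)
    X: float        # second moment E{x^2(t)}
    mean: float     # E{x(t)}
    u: float        # deterministic input


def error_cov(psi, s):
    # Predicted error covariance as a function of the gain, Eq. (3.3.12).
    return ((s.a_bar**2 + psi**2 * s.c_bar**2
             - 2.0 * s.a_bar * s.c_bar * psi) * s.sigma
            + (s.s_aa + psi**2 * s.s_cc - 2.0 * s.s_ac * psi) * s.X
            + s.s_bb * s.u**2 + s.xi + psi**2 * s.theta
            + 2.0 * (s.s_ab - psi * s.s_bc) * s.u * s.mean)


def optimal_gain(s):
    # Stationary point of error_cov with respect to psi, Eq. (3.3.19).
    num = s.a_bar * s.c_bar * s.sigma + s.s_ac * s.X + s.s_bc * s.u * s.mean
    den = s.c_bar**2 * s.sigma + s.theta + s.s_cc * s.X
    return num / den


stats = FilterStats(a_bar=0.9, c_bar=1.0, s_aa=0.1, s_bb=0.2, s_cc=0.05,
                    s_ab=0.02, s_ac=0.01, s_bc=0.03, xi=1.0, theta=0.5,
                    sigma=2.0, X=3.0, mean=0.5, u=1.0)
psi_star = optimal_gain(stats)
for psi in (psi_star - 0.1, psi_star, psi_star + 0.1):
    print(f"psi = {psi:.4f}  error covariance = {error_cov(psi, stats):.6f}")
```

Since Eq. (3.3.12) is a quadratic in $\psi$ whose leading coefficient $\bar c^2\Sigma_{xx} + \Theta + \Sigma_{cc}X$ is positive whenever $\Theta > 0$, the stationary point is the unique minimum.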


3.4  Linear  Filter  With  Uncorrelated  Parameters 


In this section we will present the results on the
asymptotic behavior of the linear minimum variance filter
when the random parameters are mutually independent at all
times. This assumption is made to simplify the algebra and
notation, but does not change the conclusions.


The optimal linear filter is given by the recursive
equations.

Prediction: (Update Cycle)

$$\hat x(t+1|t) = (\bar a(t) - \psi(t)\,\bar c(t))\,\hat x(t|t-1) + \bar b(t)\,u(t) + \psi(t)\,z(t)$$ (3.4.1)

The estimate has to be computed on-line since it is dependent
on the current observations. The filter gain computation is
given by

$$\psi(t) = \frac{\bar a(t)\,\bar c(t)\,\Sigma_{xx}(t|t-1)}{\bar c^2(t)\,\Sigma_{xx}(t|t-1) + \Sigma_{cc}(t)\,X(t) + \Theta(t)}$$ (3.4.2)


The estimation error covariance is given by

$$\Sigma_{xx}(t+1|t) = \bar a^2(t)\,\Sigma_{xx}(t|t-1) - \bar a(t)\,\bar c(t)\,\Sigma_{xx}(t|t-1)\,\psi(t) + \Sigma_{bb}(t)\,u^2(t) + \Sigma_{aa}(t)\,X(t) + \Xi(t)$$

$$= \bar a^2(t)\,\Sigma_{xx}(t|t-1) - \psi^2(t)\,[\,\bar c^2(t)\,\Sigma_{xx}(t|t-1) + \Sigma_{cc}(t)\,X(t) + \Theta(t)\,] + \Sigma_{bb}(t)\,u^2(t) + \Sigma_{aa}(t)\,X(t) + \Xi(t)$$ (3.4.3)

and can be computed off-line.
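
The off-line part of the filter (gain, error covariance, and second moment of the state) can be sketched as follows for uncorrelated parameters with stationary statistics and constant input. This is a minimal Python illustration, not the original implementation: the function name and numerical values are hypothetical, and the unconditional mean is propagated as $E\{x(t+1)\} = \bar a\,E\{x(t)\} + \bar b\,u$, which follows from Eq. (3.2.1) when the parameters are independent of the state.

```python
def linear_mv_filter_offline(a_bar, c_bar, s_aa, s_cc, xi, theta,
                             s_x0, x0_bar, steps,
                             b_bar=0.0, s_bb=0.0, u=0.0):
    sigma = s_x0               # Sigma_xx(0|-1), Eq. (3.3.17)
    X = s_x0 + x0_bar**2       # X(0), Eq. (3.3.18)
    mean = x0_bar              # E{x(0)}
    history = []               # (psi(t), Sigma_xx(t|t-1), X(t)) per step
    for _ in range(steps):
        den = c_bar**2 * sigma + s_cc * X + theta
        psi = a_bar * c_bar * sigma / den                        # (3.4.2)
        sigma_next = (a_bar**2 * sigma - psi**2 * den
                      + s_bb * u**2 + s_aa * X + xi)             # (3.4.3)
        X_next = ((a_bar**2 + s_aa) * X + (b_bar**2 + s_bb) * u**2
                  + xi + 2.0 * a_bar * b_bar * u * mean)         # (3.3.13)
        mean = a_bar * mean + b_bar * u
        history.append((psi, sigma, X))
        sigma, X = sigma_next, X_next
    return history, sigma, X


# Case 1: Sigma_aa + a_bar^2 = 0.84 < 1, moments settle to finite values.
_, sigma_stable, X_stable = linear_mv_filter_offline(
    a_bar=0.8, c_bar=1.0, s_aa=0.2, s_cc=0.1, xi=1.0, theta=1.0,
    s_x0=1.0, x0_bar=0.0, steps=300)

# Case 2: Sigma_aa + a_bar^2 = 1.3 > 1, moments grow without bound.
_, sigma_unstable, X_unstable = linear_mv_filter_offline(
    a_bar=1.0, c_bar=1.0, s_aa=0.3, s_cc=0.1, xi=1.0, theta=1.0,
    s_x0=1.0, x0_bar=0.0, steps=200)

print("stable case:  ", sigma_stable, X_stable)
print("unstable case:", sigma_unstable, X_unstable)
```

In the first case the second moment approaches $\Xi/(1 - \bar a^2 - \Sigma_{aa})$ and the error covariance stays below it. In the second case both quantities grow geometrically, because the term $\Sigma_{aa}X(t)$ acts on the covariance recursion as an unbounded driving noise.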

We can also rewrite the filtering equations in terms
of the mixed equations as follows.

Filtering: (Measurement Update Cycle)

From Eq. (3.4.1) we have

$$\hat x(t|t) = (1 - H(t)\,\bar c(t))\,\hat x(t|t-1) + H(t)\,z(t)$$ (3.4.4)

We redefine the filter gain in terms of H(t), the
standard filter gain, using

$$\psi(t) = \bar a(t)\,H(t)$$ (3.4.5)

From Eq. (3.4.2), we write the updated estimation error
covariance as

$$\Sigma_{xx}(t|t) = (1 - H(t)\,\bar c(t))\,\Sigma_{xx}(t|t-1)$$ (3.4.6)

It can be seen that the estimation error covariance depends
on the input u(t-1).

It can be shown for the uncorrelated parameter
case that [64]

$$E\{(x(t) - \hat x(t|t))\,\hat x(t|t)\} = 0 \quad \forall t > 0$$

if $E\{(x(0) - \hat x(0|0))\,\hat x(0|0)\} = 0$. The estimation error is thus
orthogonal to the state estimate.

The optimal linear filter given by Eqs. (3.4.1) to
(3.4.6) resembles the standard Kalman filter for linear
Gaussian estimation problems. However, the computation of
the second moment of the state X(t) is an added term for
the random parameter problem. The positive semidefiniteness
of the covariance of c(t) adds "convexity" to the filtering
problem and makes the solution better behaved numerically.
The random parameter covariances incorporate equivalent
driving noises and measurement noises in a natural manner
into the problem.

Figure 3.1 Linear minimum variance unbiased estimator for stochastic system

In the case where the random parameters have stationary
statistics, as do $\xi(t)$ and $\theta(t)$, stability conditions
for the minimum variance filter can be given. The
nonlinear difference Eq. (3.4.3) is then

$$\Sigma_{xx}(t+1|t) = \bar a^2\,\Sigma_{xx}(t|t-1) + \Sigma_{aa}\,X(t) + \Sigma_{bb}\,u^2 + \Xi - \bar a^2 H^2(t)\,[\,\bar c^2\,\Sigma_{xx}(t|t-1) + \Sigma_{cc}\,X(t) + \Theta\,]$$ (3.4.7)

where u(t) is assumed to be constant also. The effect of
u(t) = constant is to increase the additive noise
$\xi(t)$ in the system by a time-varying additive noise b(t)u
of mean $\bar b(t)\,u$ and covariance $u^2\,\Sigma_{bb}(t)$. In the steady state
the state estimation error covariance is, therefore, increased
due to $u^2\,\Sigma_{bb}(t)$.

The boundedness of the predicted error covariance
depends on the boundedness of the second state moment X(t)
in Eq. (3.4.7). From Eq. (3.3.14), the second moment is
asymptotically mean-square stable if and only if $(\Sigma_{aa} + \bar a^2) < 1$.
If this inequality is satisfied then $E\{x(t)\}$ is also
asymptotically mean-square stable. For stationary systems,
the asymptotic stability of the second moment of the state
X(t) is a sufficient condition for the stability of the
estimator. If $(\Sigma_{aa} + \bar a^2) < 1$ then the predicted error covariance
will be bounded. The filter is effectively a Kalman filter
with time-varying noise statistics given by $\Sigma_{aa}\,X(t)$.

We summarize the results above in the following
theorem.

Theorem 3.1

The solution to the Riccati-like Eq. (3.4.7),

$$\Sigma_{xx}(t+1|t) = \bar a^2\,\Sigma_{xx}(t|t-1) - \frac{\bar a^2\,\bar c^2\,\Sigma_{xx}^2(t|t-1)}{\bar c^2\,\Sigma_{xx}(t|t-1) + \Theta + \Sigma_{cc}\,X(t)} + \Xi + \Sigma_{aa}\,X(t) + \Sigma_{bb}\,u^2$$ (3.4.8)

exists and is unique if the condition

$$\Sigma_{aa} + \bar a^2 < 1$$

is satisfied for u(t) = constant.

The steady-state $\Sigma_{xx}$ satisfies the algebraic equation

$$\Sigma_{xx} = \bar a^2\,\Sigma_{xx} - \frac{\bar a^2\,\bar c^2\,\Sigma_{xx}^2}{\bar c^2\,\Sigma_{xx} + \Sigma_{cc}\,X + \Theta} + \Sigma_{aa}\,X + \Sigma_{bb}\,u^2 + \Xi$$ (3.4.9)

For $(\Sigma_{aa} + \bar a^2) > 1$, the predicted error covariance
diverges, but the filter gain computation is still given by

$$H = \frac{\bar c}{\bar c^2 + \Sigma_{cc}}$$ (3.4.10)

since

$$X \cong \Sigma_{xx} + E\{\hat x^2\}$$ (3.4.11)
In the special case where the parameter a(t) is
known, the necessary and sufficient condition for the
asymptotic stability of the second moment of the state is
$|\bar a| < 1$.

An approximate analysis of Eq. (3.4.7) shows that
for $\Sigma_{xx}(t+1|t)$ large

$$\Sigma_{xx}(t+1|t) \approx \bar a^2\,\Sigma_{xx}(t|t-1) + \Sigma_{aa}\,\Sigma_{xx}(t|t-1) - \frac{\bar a^2\,\bar c^2\,\Sigma_{xx}(t|t-1)}{\bar c^2 + \Sigma_{cc}} = m\,\Sigma_{xx}(t|t-1)$$ (3.4.12)

where

$$m \triangleq \bar a^2 + \Sigma_{aa} - \frac{\bar a^2\,\bar c^2}{\bar c^2 + \Sigma_{cc}}$$ (3.4.13)

so that for a divergent solution m > 1. However, this inequality
is weaker than the threshold condition given in Theorem 3.1 and
would include points which did not give rise to mean-square
stable filters. This simple analysis shows that the expression
in (3.4.13), which can be obtained by equating b with c, is only
a sufficient stability condition in the filtering problem.

3.5 Mutually Correlated Random Parameters

In this section we will consider the asymptotic
behavior of Eq. (3.3.20) when the random parameters a(t),
b(t), and c(t) may be mutually correlated at each instant
of time. For the scalar stochastic system with wide-sense
stationary statistics,

$$\Sigma_{xx}(t+1|t) = \bar a^2\,\Sigma_{xx}(t|t-1) - \frac{[\,\bar a\,\bar c\,\Sigma_{xx}(t|t-1) + \Sigma_{ac}\,X(t) + \Sigma_{bc}\,u\,E\{x(t)\}\,]^2}{\bar c^2\,\Sigma_{xx}(t|t-1) + \Sigma_{cc}\,X(t) + \Theta} + \Xi + \Sigma_{aa}\,X(t) + 2\,\Sigma_{ab}\,u\,E\{x(t)\} + \Sigma_{bb}\,u^2$$ (3.5.1)


In case the random parameter b(t) is not correlated
with any other white noise parameter, we have a simplification.
The predicted error covariance is given by

$$\Sigma_{xx}(t+1|t) = \bar a^2\,\Sigma_{xx}(t|t-1) - \frac{[\,\bar a\,\bar c\,\Sigma_{xx}(t|t-1) + \Sigma_{ac}\,X(t)\,]^2}{\bar c^2\,\Sigma_{xx}(t|t-1) + \Sigma_{cc}\,X(t) + \Theta} + \Sigma_{aa}\,X(t) + \Sigma_{bb}\,u^2 + \Xi$$ (3.5.2)


We recall from the asymptotic stability analysis of
Section 3.4 that the solution to the above Riccati-like
equation will remain bounded as $t \to \infty$ if the second moment of
x(t) is asymptotically stable. A sufficient condition for
X(t) to be asymptotically stable is that $(\Sigma_{aa} + \bar a^2) < 1$.

For $t \to \infty$, if the solution to Eq. (3.5.2)
diverges, then we can write

$$\Sigma_{xx}(t+1|t) \cong \bar a^2\,\Sigma_{xx}(t|t-1) - \frac{(\bar a\,\bar c + \Sigma_{ac})^2}{\bar c^2 + \Sigma_{cc}}\,\Sigma_{xx}(t|t-1) + \Sigma_{aa}\,\Sigma_{xx}(t|t-1)$$ (3.5.3)

since

$$X(t) = \Sigma_{xx}(t|t-1) + E\{\hat x^2(t|t-1)\}$$ (3.5.4)


(3.5.6) 


However, this is only a sufficient condition for Eq. (3.5.2)
to diverge.

The case in which the random parameter b(t) is correlated
with a(t) but not with c(t) does not change the
asymptotic stability condition, since $|\bar a| > 1$ implies $\Sigma_{aa} + \bar a^2 > 1$.
The case in which b(t) is correlated with both a(t) and c(t),
as given in Eq. (3.5.1), will also not change the asymptotic
stability results given in Eq. (3.5.5).

If u(t) = 0, then the deterministic input is effectively
eliminated from the plant Eq. (3.2.1). This allows
us to deal with only the pure estimation problem. It does
not simplify the problem any more than if we assumed that
the random parameter b(t) is uncorrelated with a(t) and c(t),
since then the input u(t) multiplied by b(t) affects Eq.
(2.2.1) as an additional driving noise. The analysis was
presented in Section 3.4. The effective additive noise
covariance is increased by $\Sigma_{bb}\,u^2$ as in Eq. (3.4.7).

3.6 Conclusions

This chapter considered the linear minimum-variance
estimation for stochastic systems with purely random (white)
parameters. Because of the random parameters multiplying
the state, the conditional density is non-Gaussian even if
all the random processes are Gaussian. We extend previous
results on the linear minimum variance estimation for such
a class of stochastic systems to include state- and control-
dependent noises in both the plant and measurement equations.

The linear filter determined in this chapter is
similar in form to the Kalman filter, except that the second
moment of the state must be propagated. Conditions for stability
of the linear minimum-variance estimator are presented.
We allow for the correlations of the uncertain parameters in
the general estimation problem. For the stochastic system
with purely random (white) parameters, we have shown that
the solution to the Riccati-like forward difference equation
may become divergent as $t \to \infty$ beyond some quantifiable threshold
depending on the means and variances of the randomly varying
parameters. This result is analogous to that for the linear quadratic
control problem, but does not arise in the standard linear-
Gaussian estimation problem.

CHAPTER  4 


OPTIMUM  CONTROL  OF  RANDOM  PARAMETER  SYSTEMS 
WITH  NOISY  MEASUREMENTS 

4.1 Introduction

In Chapter 2, optimum control of random parameter
systems with noise-free state measurements was discussed.
In this chapter we will be concerned with the optimum control
laws for systems subject to random parameters and with noisy
observations. Just as in the optimum control of systems with
deterministic parameters, the determination of random parameter
control systems involves two problems: (1) the problem of
optimum estimation and (2) the problem of optimum control.

In the standard deterministic linear-quadratic-Gaussian (LQG)
problem the separation theorem holds [3], [4]. A stronger
result stated as the Certainty-Equivalence Principle applies
to the LQG stochastic control problem. As we shall see, in
the random parameter stochastic control problem the optimum
solution does not separate, in the sense that the filter gains
are not independent of the control computation. In the white
noise parameter control problem there is no learning in the
control law. The covariances for the random parameters cannot
be reduced below their a priori values. From Chapter 2, it
follows that the Certainty-Equivalence Principle does not
apply in the random parameter problem.

The optimum control strategy for the random parameter system has to perform simultaneously the estimation and control of the state while minimizing the expected value of some scalar real-valued cost functional. In this sense, the control law derived is adaptive. It must adapt to the level of uncertainty in the parameters and the state, yet it must regulate the control system. This is an example of non-learning adaptive control. If we accept the definition of dual control as given in [8], [9], and [70], our stochastic control law is non-dual, since our knowledge of the system model does not increase.

In Section 4.2 we state the optimal control problem precisely. In Section 4.3, we investigate the optimum solution to the control problem formulated in Section 4.2 in terms of the conditional means and covariances of the state. The optimum filter is, in general, nonlinear and not practical to implement. Hence, we proceed to determine the suboptimal solution in the class of linear estimators and linear controllers. In Section 4.4 we reformulate the stochastic control problem as a deterministic optimum control problem. Two solution methods are possible: the Matrix Minimum Principle [71] and non-stationary dynamic programming. The structure of the optimum controller is given in Section 4.5. In Section 4.6, we discuss in more detail the qualitative properties of the optimal control law for the fixed structure feedback control system. In Section 4.7, we examine the asymptotic behavior of the stationary control for stochastic systems with stationary statistics and constant weights in the cost functional. In Section 4.8, analogous to Section 2.5, we analyze the stability of the stochastic system under output feedback. We are interested in the question of the existence of optimum controls in steady state for finite cost.

4.2 Problem Statement

Consider a linear stochastic system with purely random parameters characterized by the scalar difference equation, Eq. (2.2.1),

$$x(t+1) = a(t)\,x(t) + b(t)\,u(t) + \xi(t) \qquad (4.2.1)$$

The measurement equation is also scalar,

$$z(t) = c(t)\,x(t) + \theta(t) \qquad (4.2.2)$$

where $\xi(t)$ and $\theta(t)$ are mutually independent zero-mean Gaussian white noises with known statistics,

$$E\{\xi(t)\,\xi(\tau)\} = \Xi(t)\,\delta(t,\tau) \qquad (4.2.3)$$

$$E\{\theta(\tau)\,\theta(t)\} = \Theta(t)\,\delta(t,\tau) \qquad (4.2.4)$$

The initial state $x(0)$ has known a priori statistics

$$E\{x(0)\} = \bar{x}(0) = \hat{x}(0|-1) \qquad (4.2.5)$$

$$E\{(x(0) - \bar{x}(0))^2\} = \Sigma_{x0} \qquad (4.2.6)$$

The time-varying system parameters $a(t)$ and $b(t)$ are white processes, uncorrelated in time, with known statistics,

$$E\{a(t)\} = \bar{a}(t), \quad E\{(a(t) - \bar{a}(t))(a(\tau) - \bar{a}(\tau))\} = \Sigma_{aa}(t)\,\delta(t,\tau) \qquad (4.2.7)$$

$$E\{b(t)\} = \bar{b}(t), \quad E\{(b(t) - \bar{b}(t))(b(\tau) - \bar{b}(\tau))\} = \Sigma_{bb}(t)\,\delta(t,\tau) \qquad (4.2.8)$$

The white random parameters $a(t)$ and $b(t)$ may be correlated with each other at the same time $t$,

$$E\{(a(t) - \bar{a}(t))(b(\tau) - \bar{b}(\tau))\} = \Sigma_{ab}(t)\,\delta(t,\tau) \qquad (4.2.9)$$

The coefficient $c(t)$ is assumed to be white, uncorrelated in time, with known statistics,

$$E\{c(t)\} = \bar{c}(t), \quad E\{(c(t) - \bar{c}(t))(c(\tau) - \bar{c}(\tau))\} = \Sigma_{cc}(t)\,\delta(t,\tau) \qquad (4.2.10)$$

Finally, it is assumed that the output coefficient $c(t)$ is uncorrelated with the system parameters $a(t)$ and $b(t)$ for all time indexes. The white random coefficients $a(t)$ and $b(t)$ are uncorrelated with the additive noise $\xi(t)$, and $c(t)$ is uncorrelated with the additive noise $\theta(t)$, for all time indexes.

The optimum stochastic control problem is to determine a non-anticipative closed-loop control law, based on the past and current measurements and past controls, that minimizes the expected value of a quadratic function of the state and control variables,

$$J = E\left\{ F x^2(N) + \sum_{t=0}^{N-1} \left[ Q(t)\,x^2(t) + R(t)\,u^2(t) \right] \right\} \qquad (4.2.11)$$

subject to the dynamics of Eq. (4.2.1) and the measurement function Eq. (4.2.2). The weightings $Q(t)$ and $F$ are assumed to be positive semi-definite and $R(t)$ is assumed to be positive definite.

The admissible controls are required to be measurable functions of the current and past measurements to assure that they are random variables. We denote the entire measurement history by $z^t = \{z(0), z(1), \ldots, z(t)\}$ and the entire control history by $u^{t-1} = \{u(0), u(1), \ldots, u(t-1)\}$. We seek control laws of the type $u(t) = \gamma(t, \hat{x}(t|t))$, $u \in U$, where $\hat{x}(t|t)$ is a sufficient statistic of the state. The control specified has perfect recall (memory) and a totally nested information structure.
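
To make the problem statement concrete, the following is a minimal simulation sketch of the plant (4.2.1), the measurement (4.2.2), and the realized value of the cost inside the expectation of (4.2.11). It is an illustration only: the function names, the restriction to constant statistics, the Gaussian distribution of the parameters, and the choice $\Sigma_{ab} = 0$ are assumptions of the sketch, not part of the formulation above.

```python
import random


def simulate(N, a_bar, s_aa, b_bar, s_bb, c_bar, s_cc, xi_var, theta_var,
             x0, policy, seed=0):
    """Simulate the scalar system (4.2.1)-(4.2.2) for N steps.

    Assumptions of this sketch: constant statistics, Gaussian white
    parameters with Sigma_ab = 0, and a caller-supplied non-anticipative
    policy(t, z_history) -> u(t) that sees only z(0), ..., z(t).
    """
    rng = random.Random(seed)
    x = x0
    xs, zs, us = [x0], [], []
    for t in range(N):
        # Measurement with random output coefficient c(t) and noise theta(t).
        c = rng.gauss(c_bar, s_cc ** 0.5)
        z = c * x + rng.gauss(0.0, theta_var ** 0.5)
        zs.append(z)
        # The control may depend on current and past measurements only.
        u = policy(t, zs)
        us.append(u)
        # State transition with white parameters a(t), b(t) and noise xi(t).
        a = rng.gauss(a_bar, s_aa ** 0.5)
        b = rng.gauss(b_bar, s_bb ** 0.5)
        x = a * x + b * u + rng.gauss(0.0, xi_var ** 0.5)
        xs.append(x)
    return xs, zs, us


def realized_cost(xs, us, Q, R, F):
    """Sample value of the quadratic form inside the expectation in (4.2.11)."""
    running = sum(Q * x * x + R * u * u for x, u in zip(xs[:-1], us))
    return F * xs[-1] ** 2 + running


if __name__ == "__main__":
    # Output feedback u(t) = -0.5 z(t), chosen arbitrarily for illustration.
    xs, zs, us = simulate(20, 1.0, 0.1, 1.0, 0.1, 1.0, 0.05, 0.01, 0.01,
                          x0=1.0, policy=lambda t, zh: -0.5 * zh[-1])
    print(realized_cost(xs, us, Q=1.0, R=1.0, F=1.0))
```

Averaging `realized_cost` over many seeds gives a Monte Carlo estimate of $J$ for a given policy, which is one way to compare candidate control laws numerically.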

For the multistage stochastic control problem, we have that

$$J = E\{L(u(t), \xi(t), x(t)) + L(x(t+1))\}, \quad \xi \in \Omega \qquad (4.2.12)$$

where we define the information available to $u(t)$ at $t$ as

$$z^t \triangleq \{u(0), \ldots, u(t-1), y(1), \ldots, y(t)\} \qquad (4.2.13)$$

Then the Principle of Optimality implies that

$$J^*(z^t) = \min_{u(t)} E\{L(u(t), \xi(t), x(t)) + J^*(z^{t+1}) \mid z^t\} \qquad (4.2.14)$$

We have examined the problem where $z(t) = x(t)$ (perfect observation of the state) in Chapter 2. When the measurement is not exact, the solution of Eq. (4.2.14) requires the knowledge of $p(x(t)|z^t)$. The assumption of perfect memory renders $p(x(t)|z^t)$ a well-defined probability distribution function and permits a recursive computation of $p(x(t+1)|z^{t+1})$ from $p(x(t)|z^t)$ by a filtering algorithm. If the filtering algorithm does not depend on the control functions $\gamma(0), \gamma(1), \ldots, \gamma(t)$, then the Separation Theorem holds for the dynamic optimization problem.

4.3 Optimum Solution of the Stochastic Control Problem

In this section, we investigate the stochastic control problem via the method of dynamic programming. We derive the optimum stochastic control law using Bellman's Principle of Optimality. We define the cost-to-go at $t = N-1$, given measurements $z^{N-1}$ and using the optimum control $u(N-1)$, by

$$V(N-1, x(N-1)) = \min_{u(N-1)} E\{F x^2(N) + Q(N-1)\,x^2(N-1) + R(N-1)\,u^2(N-1) \mid z^{N-1}\}$$

$$= \min_{u(N-1)} E\{x^2(N-1)\,(F a^2(N-1) + Q(N-1)) + 2 F a(N-1)\,b(N-1)\,x(N-1)\,u(N-1) + (F b^2(N-1) + R(N-1))\,u^2(N-1) \mid z^{N-1}\} + F\,\Xi(N-1) \qquad (4.3.1)$$

since $\xi(N-1)$ is independent of $u(N-1)$ and $x(N-1)$.

If we let

$$\hat{x}(N-1|N-1) \triangleq E\{x(N-1) \mid z^{N-1}\} \qquad (4.3.2)$$

be the conditional expectation of $x(N-1)$ given the information statistic and similarly let

$$\Sigma_{xx}(N-1|N-1) \triangleq E\{(x(N-1) - \hat{x}(N-1|N-1))^2 \mid z^{N-1}\} \qquad (4.3.3)$$

be the conditional covariance, and assume that $a(t)$ and $b(t)$ are independent of $x(t)$, we then obtain

$$V(x(N-1), N-1) = \min_{u(N-1)} \Big\{ E\{x^2(N-1)\,(F(\bar{a}^2(N-1) + \Sigma_{aa}(N-1)) + Q(N-1)) \mid z^{N-1}\} + 2F(\Sigma_{ab}(N-1) + \bar{a}(N-1)\,\bar{b}(N-1))\,\hat{x}(N-1|N-1)\,u(N-1) + (F(\bar{b}^2(N-1) + \Sigma_{bb}(N-1)) + R(N-1))\,u^2(N-1) \Big\} + F\,\Xi(N-1) \qquad (4.3.4)$$

Taking the derivative of the expression on the right hand side with respect to $u(N-1)$ for the algebraic minimization, we get

$$u^*(N-1) = -G(N-1)\,\hat{x}(N-1|N-1) \qquad (4.3.5)$$

$$G(N-1) = \frac{F(\Sigma_{ab}(N-1) + \bar{a}(N-1)\,\bar{b}(N-1))}{F(\bar{b}^2(N-1) + \Sigma_{bb}(N-1)) + R(N-1)} \qquad (4.3.6)$$
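
The minimization step can be spelled out as follows. The shorthand $B$ and $C$ is introduced here only for this illustration and does not appear elsewhere in the text. The part of the cost-to-go that depends on $u(N-1)$ is a scalar quadratic,

$$2B\,\hat{x}(N-1|N-1)\,u(N-1) + C\,u^2(N-1), \quad B = F(\Sigma_{ab}(N-1) + \bar{a}(N-1)\,\bar{b}(N-1)), \quad C = F(\bar{b}^2(N-1) + \Sigma_{bb}(N-1)) + R(N-1).$$

Setting the derivative with respect to $u(N-1)$ to zero gives

$$2B\,\hat{x}(N-1|N-1) + 2C\,u(N-1) = 0 \quad \Longrightarrow \quad u^*(N-1) = -\frac{B}{C}\,\hat{x}(N-1|N-1),$$

and since $R(N-1) > 0$ and $F \geq 0$ imply $C > 0$, the stationary point is a minimum. The ratio $B/C$ is the gain $G(N-1)$ of Eq. (4.3.6).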

Substituting these results into the expression for the cost-to-go, we get

$$V(x(N-1), N-1) = E\{x^2(N-1)\,(F(\bar{a}^2(N-1) + \Sigma_{aa}(N-1)) + Q(N-1)) \mid z^{N-1}\} - \frac{F^2(\Sigma_{ab}(N-1) + \bar{a}(N-1)\,\bar{b}(N-1))^2}{F(\bar{b}^2(N-1) + \Sigma_{bb}(N-1)) + R(N-1)}\,\hat{x}^2(N-1|N-1) + F\,\Xi(N-1)$$

$$= E\{x^2(N-1)\,K(N-1) \mid z^{N-1}\} + \frac{[F(\Sigma_{ab}(N-1) + \bar{a}(N-1)\,\bar{b}(N-1))]^2}{F(\Sigma_{bb}(N-1) + \bar{b}^2(N-1)) + R(N-1)}\,\Sigma_{xx}(N-1|N-1) + F\,\Xi(N-1) \qquad (4.3.7)$$

where

$$K(N-1) = F(\bar{a}^2(N-1) + \Sigma_{aa}(N-1)) + Q(N-1) - \frac{[F(\bar{a}(N-1)\,\bar{b}(N-1) + \Sigma_{ab}(N-1))]^2}{F(\bar{b}^2(N-1) + \Sigma_{bb}(N-1)) + R(N-1)} \qquad (4.3.8)$$

An alternative form for the cost-to-go expression Eq. (4.3.7) is given by

$$V(\hat{x}(N-1|N-1), N-1) = K(N-1)\,\hat{x}^2(N-1|N-1) + [F(\bar{a}^2(N-1) + \Sigma_{aa}(N-1)) + Q(N-1)]\,\Sigma_{xx}(N-1|N-1) + F\,\Xi(N-1) \qquad (4.3.9)$$

In [37], it is claimed that the second term in the cost-to-go expression, Eq. (4.3.7), will be independent of the past controls if the estimation error has a conditional covariance independent of $x(N-1)$ and $z^{N-1}$. In the deterministic linear-quadratic-Gaussian control problem it can be shown that

$$E\{(x(t) - \hat{x}(t|t))^2 \mid z^t\}, \quad 0 \leq t \leq N$$

are independent of $x(t)$ and $z^t$ (see [3], [4], [72], [73]), since the estimation errors $e(t) = x(t) - \hat{x}(t|t)$ can be shown to be independent of the past measurements or functions of these measurements. Therefore, the estimation errors are independent of past controls. Only the first term in the expectation of Eq. (4.3.7) is influenced by previous control policies.


At time $t = N-2$, we then have the cost-to-go

$$V(N-2, x(N-2)) = \min_{u(N-2)} E\{V(N-1, x(N-1)) + Q(N-2)\,x^2(N-2) + R(N-2)\,u^2(N-2) \mid z^{N-2}\}$$

$$= \min_{u(N-2)} E\{K(N-1)\,x^2(N-1) + Q(N-2)\,x^2(N-2) + R(N-2)\,u^2(N-2) \mid z^{N-2}\} \qquad (4.3.10)$$

using the property of the conditional expectation

$$E\{E\{\cdot \mid z^{N-1}\} \mid z^{N-2}\} = E\{\cdot \mid z^{N-2}\} \qquad (4.3.11)$$

The cost-to-go expression in Eq. (4.3.10) has a form identical to Eq. (4.3.1) except for the indexes. The inductive procedure now repeats.

We state the following theorem based on our results.

Theorem 4.1

Given the stochastic linear dynamical system described by Eqs. (4.2.1) and (4.2.2) and the admissible control law belonging to the class of causal inputs, the optimum control law that minimizes the expected value of the cost functional Eq. (4.2.11) is given by

$$u^*(t) = -G(t)\,\hat{x}(t|t) \qquad (4.3.12)$$

$$G(t) = \frac{K(t+1)\,(\bar{a}(t)\,\bar{b}(t) + \Sigma_{ab}(t))}{K(t+1)\,(\bar{b}^2(t) + \Sigma_{bb}(t)) + R(t)} \qquad (4.3.13)$$

$$K(t) = K(t+1)\,(\bar{a}^2(t) + \Sigma_{aa}(t)) + Q(t) - \frac{[K(t+1)\,(\bar{a}(t)\,\bar{b}(t) + \Sigma_{ab}(t))]^2}{K(t+1)\,(\bar{b}^2(t) + \Sigma_{bb}(t)) + R(t)}, \quad K(N) = F \qquad (4.3.14)$$

The estimate $\hat{x}(t|t)$ in Eq. (4.3.12) is the conditional estimate $E\{x(t) \mid z^t\}$ computed via some optimal nonlinear filter.
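
The backward recursion of Theorem 4.1 is straightforward to evaluate numerically. The sketch below implements Eqs. (4.3.13) and (4.3.14) under the simplifying assumption of time-invariant statistics and weights; the function name `gain_recursion` and its argument names are hypothetical choices for this illustration.

```python
def gain_recursion(N, a_bar, b_bar, s_aa, s_bb, s_ab, Q, R, F):
    """Backward recursion for K(t) and G(t), Eqs. (4.3.13)-(4.3.14).

    Assumes constant means (a_bar, b_bar), constant variances
    (s_aa, s_bb, s_ab) and constant weights Q, R, F.
    Returns K[0..N] and G[0..N-1].
    """
    K = [0.0] * (N + 1)
    G = [0.0] * N
    K[N] = F  # terminal condition K(N) = F
    for t in range(N - 1, -1, -1):
        cross = K[t + 1] * (a_bar * b_bar + s_ab)
        denom = K[t + 1] * (b_bar ** 2 + s_bb) + R
        G[t] = cross / denom
        K[t] = K[t + 1] * (a_bar ** 2 + s_aa) + Q - cross ** 2 / denom
    return K, G


if __name__ == "__main__":
    # Same means, with and without uncertainty in b(t).
    _, G_certain = gain_recursion(10, 1.0, 1.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0)
    _, G_uncertain = gain_recursion(10, 1.0, 1.0, 0.0, 1.0, 0.0, 1.0, 1.0, 1.0)
    print(G_certain[0], G_uncertain[0])
```

Setting all parameter variances to zero recovers the certainty-equivalence gain, so running the recursion twice, as in the usage lines, shows how the a priori uncertainty modulates the gain.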

In general, the cost-to-go is given by

$$V^*(x(t), t) = E\{x^2(t)\,K(t) + p(t) \mid z^t\} \qquad (4.3.15)$$

$$p(t) = p(t+1) + K(t+1)\,\Xi(t) + \frac{[K(t+1)\,(\bar{a}(t)\,\bar{b}(t) + \Sigma_{ab}(t))]^2}{K(t+1)\,(\bar{b}^2(t) + \Sigma_{bb}(t)) + R(t)}\,\Sigma_{xx}(t|t), \quad p(N) = 0 \qquad (4.3.16)$$

The average value of the performance index, Eq. (4.2.11), is given by

$$J(0) = K(0)\,E\{x^2(0)\} + \sum_{t=0}^{N-1} K(t+1)\left(\Xi(t) + (\bar{a}(t)\,\bar{b}(t) + \Sigma_{ab}(t))\,G(t)\,\Sigma_{xx}(t|t)\right) \qquad (4.3.17)$$

using the fact $E\{E\{\cdot \mid z\}\} = E\{\cdot\}$.

When the state variable $x(t)$ can be measured exactly, $E\{x(t) \mid z^t\}$ becomes $x(t)$ and hence the term

$$E\left\{(x(t) - \hat{x}(t|t))^2\,\frac{[K(t+1)\,(\bar{a}(t)\,\bar{b}(t) + \Sigma_{ab}(t))]^2}{K(t+1)\,(\bar{b}^2(t) + \Sigma_{bb}(t)) + R(t)}\right\} \qquad (4.3.18)$$

vanishes and the optimal control law is

$$u^*(t) = -G(t)\,x(t) \qquad (4.3.19)$$

where $G(t)$ is given by Eq. (4.3.13). These results for the perfect measurement case have been presented in Section 2.3.

We remark that the gain in the optimal controller for the stochastic system with noisy state measurement is the same as the gain in the optimal controller when the state measurements are exact. The certainty-equivalence controller is not the optimal controller for the stochastic system with random parameters. The control gains are functions of the variances of the white parameters. In this case, separation of estimation and control exists, since the control depends only on the expected value of the current state, given past measurements. Separation occurs in the optimum solution since the control affects only the conditional mean of the state. The feedback gains in Eq. (4.3.13) can be calculated a priori, independent of the filter computations.

The  optimum  controller  given  by  Eqs.  (4.3.12)  to 
(4.3.14)  "hedges"  or  acts  cautiously  or  vigorously  depending 
on  the  amount  and  type  of  uncertainty.  No  learning  of  the 
system  parameters  is  involved  in  the  estimation  process, 
however.  The  controller  gains  are  modulated  by  the  uncer- 
tainties of  the  parameters  and  exhibit  the  behavior  of  an 
adaptive  control  law.  Since  there  is  no  learning  in  the 
closed-loop  control  system,  the  control  is  non-dual  in  the 
sense of [8] and [22].

The conditional probability density function of $x(t)$ given $z^t$ is in general very difficult to evaluate. A nonlinear filter is required, which is usually not realizable for practical purposes. We will, therefore, examine some approximate solutions to the stochastic control problem posed in Section 4.2 by fixing the structure of the controller and the filter to be linear.

The stochastic control problem can be reformulated in terms of the state estimate, estimation error, and error covariance as a deterministic optimization problem. The parameter optimization problem is solved first using the matrix minimum principle. A true two-point boundary value problem (TPBVP) results because the control now affects both the mean and error covariance of the estimation process. We do not have the standard separation theorem results. This fixed structure controller-estimator exhibits the dual nature of control, where the filter gains and control are used to improve the estimates. This suboptimal solution is different from the optimal solution given in the previous Section 4.3, where the control does not affect the variance of the conditional estimator; here, in contrast, the control does affect the linear minimum variance estimator. For simplicity of filter structure, we have added the complexity of a policy dependent estimator, a true tradeoff in implementing a closed-loop estimator-controller.

Before we proceed to present the results on the constrained estimator-controller suboptimal control, we shall elaborate further on the concept of policy independence of the conditional mean and discuss a control based on the approximation to the conditional mean. As a result, we will derive an enforced separation controller for the random parameter system.

If the conditional mean and covariance in the cost $V(N-1, \hat{x}(N-1|N-1))$ are computed via the minimum variance linear unbiased filter of Chapter 3, then we have

$$V(\hat{x}(N-1|N-1), N-1) = K(N-1)\,\hat{x}^2(N-1|N-1) + [F(\bar{a}^2(N-1) + \Sigma_{aa}(N-1)) + Q(N-1)]\,\Sigma_{xx}(N-1|N-1) + F\,\Xi(N-1) \qquad (4.3.20)$$

where

$$\hat{x}(N-1|N-1) = (1 - H(N-1)\,\bar{c}(N-1))\,\hat{x}(N-1|N-2) + H(N-1)\,z(N-1) \qquad (4.3.21)$$

$$\hat{x}(N-1|N-2) = \bar{a}(N-2)\,\hat{x}(N-2|N-2) + \bar{b}(N-2)\,u(N-2) \qquad (4.3.22)$$

$$H(N-1) = \Sigma_{xx}(N-1|N-2)\,\bar{c}(N-1)\left[\bar{c}^2(N-1)\,\Sigma_{xx}(N-1|N-2) + \Sigma_{cc}(N-1)\,X(N-1) + \Theta(N-1)\right]^{-1} \qquad (4.3.23)$$

$$\Sigma_{xx}(N-1|N-2) = \bar{a}^2(N-2)\,\Sigma_{xx}(N-2|N-2) + \Sigma_{aa}(N-2)\,X(N-2) + \Sigma_{bb}(N-2)\,u^2(N-2) + \Xi(N-2) \qquad (4.3.24)$$

$$\Sigma_{xx}(N-1|N-1) = (1 - H(N-1)\,\bar{c}(N-1))^2\,\Sigma_{xx}(N-1|N-2) + H^2(N-1)\,(\Sigma_{cc}(N-1)\,X(N-1) + \Theta(N-1)) \qquad (4.3.25)$$

$$X(N-1) = E\{x^2(N-1) \mid z^{N-2}\} = (\bar{a}^2(N-2) + \Sigma_{aa}(N-2))\,X(N-2) + 2\bar{a}(N-2)\,\bar{b}(N-2)\,u(N-2)\,\hat{x}(N-2|N-2) + (\bar{b}^2(N-2) + \Sigma_{bb}(N-2))\,u^2(N-2) + \Xi(N-2) \qquad (4.3.26)$$
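
As an illustration of how the linear filter of Eqs. (4.3.21) to (4.3.26) is propagated, the sketch below performs one prediction and measurement update. It assumes time-invariant statistics and $\Sigma_{ab} = 0$, and it reads the cross term of (4.3.26) as using the current estimate $\hat{x}(N-2|N-2)$; the function and argument names are hypothetical.

```python
def filter_step(x_hat, sig_xx, X, u, z_next,
                a_bar, b_bar, c_bar, s_aa, s_bb, s_cc, xi_var, theta_var):
    """One step of the linear minimum variance filter, Eqs. (4.3.21)-(4.3.26).

    Inputs are the current estimate x_hat, its error covariance sig_xx,
    the state second moment X, the applied control u and the next
    measurement z_next. Assumes constant statistics and Sigma_ab = 0.
    """
    # Prediction of the mean, Eq. (4.3.22).
    x_pred = a_bar * x_hat + b_bar * u
    # Predicted error covariance, Eq. (4.3.24): note the dependence on u.
    sig_pred = a_bar ** 2 * sig_xx + s_aa * X + s_bb * u ** 2 + xi_var
    # Second moment of the state, Eq. (4.3.26).
    X_next = ((a_bar ** 2 + s_aa) * X + 2.0 * a_bar * b_bar * u * x_hat
              + (b_bar ** 2 + s_bb) * u ** 2 + xi_var)
    # Effective measurement noise: state-dependent part plus additive part.
    meas_var = s_cc * X_next + theta_var
    # Filter gain, Eq. (4.3.23).
    H = sig_pred * c_bar / (c_bar ** 2 * sig_pred + meas_var)
    # Measurement update of mean and covariance, Eqs. (4.3.21) and (4.3.25).
    x_new = (1.0 - H * c_bar) * x_pred + H * z_next
    sig_new = (1.0 - H * c_bar) ** 2 * sig_pred + H ** 2 * meas_var
    return x_new, sig_new, X_next, H


if __name__ == "__main__":
    print(filter_step(0.0, 1.0, 1.0, 0.5, 1.2,
                      1.0, 1.0, 1.0, 0.1, 0.1, 0.05, 0.01, 0.1))
```

With `s_aa = s_bb = s_cc = 0` the step reduces to the scalar Kalman filter; with nonzero parameter variances the predicted covariance grows with `u ** 2` and with the second moment `X`, which is how the control enters the estimation performance.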


The estimation error covariance depends on the past control. Hence the optimal control $u(N-2)$ which minimizes $V(N-2)$ would also seek to minimize the estimation error. In other words, the control has to perform the dual function of control and estimation of the state, and this leads to the inseparability of stochastic control and estimation. To obtain an ad hoc control, we can assume that $\Sigma_{xx}(N-1|N-1)$ is independent of the control, and hence obtain the enforced separation control by minimizing the cost-to-go

$$V(N-2) = \min_{u(N-2)} E\{K(N-1)\,x^2(N-1) + Q(N-2)\,x^2(N-2) + R(N-2)\,u^2(N-2) \mid z^{N-2}\} \qquad (4.3.27)$$

and obtain that the suboptimal control is given by

$$u(N-2) = -G(N-2)\,\hat{x}(N-2|N-2) \qquad (4.3.28)$$

where the control gains are the same as those given by assuming that the measurements are exact. So,

$$u(t) = -G(t)\,\hat{x}(t|t) \qquad (4.3.29)$$

$$G(t) = \frac{\bar{a}(t)\,\bar{b}(t)\,K(t+1)}{(\bar{b}^2(t) + \Sigma_{bb}(t))\,K(t+1) + R(t)} \qquad (4.3.30)$$

$$K(t) = (\bar{a}^2(t) + \Sigma_{aa}(t))\,K(t+1) + Q(t) - \frac{[\bar{a}(t)\,\bar{b}(t)\,K(t+1)]^2}{(\bar{b}^2(t) + \Sigma_{bb}(t))\,K(t+1) + R(t)} \qquad (4.3.31)$$

$$K(N) = F \qquad (4.3.32)$$

and the estimate is the minimum mean-square estimate given in Chapter 3.


The average cost for this enforced separation solution is given by

$$J(0) = K(0)\,(\Sigma_{x0} + \bar{x}_0^2) + \sum_{t=0}^{N-1} \left\{ K(t+1)\,\Xi(t) + \bar{a}^2(t)\,K^2(t+1)\,\bar{b}^2(t)\left[R(t) + (\bar{b}^2(t) + \Sigma_{bb}(t))\,K(t+1)\right]^{-1}\Sigma_{xx}(t|t) \right\} \qquad (4.3.33)$$

We remark that other types of suboptimal feedback control laws have been considered in the literature, such as the output feedback zero memory controller in continuous time [41], [43]. It is also possible to cascade an ad hoc scheme based on the Kalman filter and the deterministic control law given in Section 2.3. The Kalman filter is implemented by arbitrarily setting $\Sigma_{aa}(t) = \Sigma_{bb}(t) = \Sigma_{cc}(t) = 0$. The resulting filter gains would not reflect the level of uncertainties in the system parameters.

4.4 Formulation of the Deterministic Control Problem

In this section we will find an approximate solution to the optimal stochastic control problem. The goal is to apply standard deterministic optimization techniques to the stochastic control problem formulated in Section 4.2. We will assume that the suboptimal adaptive feedback compensator consists of a linear controller cascaded with a linear estimator. We shall see that the reformulated problem is a deterministic optimization problem. The discrete-time minimum principle or the dynamic programming method can then be applied to find the optimal control and filter gain sequences.

We are given the first-order linear stochastic system Eqs. (4.2.1) and (4.2.2) with quadratic cost functional Eq. (4.2.11). Assume that the control law is linear in the state estimate and time-varying* so that

$$u(t) = -G(t)\,\hat{x}(t) \qquad (4.4.1)$$

where $\hat{x}(t)$ is the best linear unbiased estimate to be determined. In general, the optimal control law would require infinite dimensional state estimators, as we have seen in the previous section. We will thus restrict the class of admissible control functions to be of a certain linear structure, Fig. 4.1.

The original cost functional given by Eq. (4.2.11) is then rewritten using Eq. (4.4.1) as

$$J = E\left\{ F x^2(N) + \sum_{t=0}^{N-1} \left[ Q(t)\,x^2(t) + R(t)\,G^2(t)\,\hat{x}^2(t) \right] \right\} \qquad (4.4.2)$$

Let us define a random vector consisting of the state variable and the estimation error (which are dual of each other in the standard LQG problem) by [74]

$$m(t) = \begin{bmatrix} x(t) \\ x(t) - \hat{x}(t) \end{bmatrix} \qquad (4.4.3)$$

*The use of a constant linear controller leads to a different, static minimization problem.

Let us denote the symmetric second moment matrix of $m(t)$ as

$$M(t) = E\{m(t)\,m'(t)\} = \begin{bmatrix} M_{00}(t) & M_{01}(t) \\ M_{10}(t) & M_{11}(t) \end{bmatrix} \qquad (4.4.4)$$

The cost functional then becomes

$$J = F M_{00}(N) + \sum_{t=0}^{N-1} \left[ Q(t)\,M_{00}(t) + R(t)\,G^2(t)\,(M_{00}(t) - M_{01}(t) - M_{10}(t) + M_{11}(t)) \right] \qquad (4.4.5)$$
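
The step from the control penalty in Eq. (4.4.2) to the second moment form in Eq. (4.4.5) can be made explicit. Assuming that $m(t)$ stacks the state $x(t)$ and the estimation error $e(t) = x(t) - \hat{x}(t)$ (the symbol $e(t)$ is used here only as shorthand), we have $\hat{x}(t) = x(t) - e(t)$ and therefore

$$E\{\hat{x}^2(t)\} = E\{x^2(t)\} - E\{x(t)\,e(t)\} - E\{e(t)\,x(t)\} + E\{e^2(t)\} = M_{00}(t) - M_{01}(t) - M_{10}(t) + M_{11}(t).$$

Similarly $E\{x^2(t)\} = M_{00}(t)$, so every term of the cost is a linear function of the entries of $M(t)$, with coefficients that depend on the gain $G(t)$.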

The  transformed  cost  is  unconditional,  and,  in  fact,  is  a 
deterministic  quantity. 

To  reformulate  completely  the  original  stochastic 
control  problem  so  that  deterministic  optimization  techniques 
can  be  used  to  solve  the  problem,  we  need  to  derive  the  dy- 
namical equations  associated  with  the  matrix  M(t). 

We shall assume that the desired estimate to be used in the feedback control function in Eq. (4.4.1) is a linear unbiased estimate. The estimator is constrained to be of the form

$$\hat{x}(t+1) = D(t+1)\,\hat{x}(t) + H(t+1)\,z(t+1) + L(t+1)\,u(t) \qquad (4.4.6)$$

Substituting Eq. (4.4.1) and Eq. (4.2.2) into the state Eq. (4.2.1) and the filter Eq. (4.4.6) we get

$$x(t+1) = a(t)\,x(t) - b(t)\,G(t)\,\hat{x}(t) + \xi(t) \qquad (4.4.7)$$

and

$$\hat{x}(t+1) = D(t+1)\,\hat{x}(t) + H(t+1)\,c(t+1)\,x(t+1) - L(t+1)\,G(t)\,\hat{x}(t) + H(t+1)\,\theta(t+1) \qquad (4.4.8)$$


Subtracting Eq. (4.4.8) from Eq. (4.4.7) we get

$$x(t+1) - \hat{x}(t+1) = a(t)\,x(t) - D(t+1)\,\hat{x}(t) - b(t)\,G(t)\,\hat{x}(t) + \xi(t) - H(t+1)\,c(t+1)\,a(t)\,x(t) + H(t+1)\,c(t+1)\,b(t)\,G(t)\,\hat{x}(t) + L(t+1)\,G(t)\,\hat{x}(t) - H(t+1)\,\theta(t+1)$$

$$= ((1 - H(t+1)\,c(t+1))\,a(t) - D(t+1))\,x(t) + D(t+1)\,(x(t) - \hat{x}(t)) + \xi(t) - \left[(1 - H(t+1)\,c(t+1))\,b(t) - L(t+1)\right] G(t)\,\hat{x}(t) - H(t+1)\,\theta(t+1) \qquad (4.4.9)$$

Imposing the condition that the estimate be an unbiased estimate of $x(t)$ for all $u(t)$, i.e.,

$$E\{x(t) - \hat{x}(t) \mid z^t\} = 0 \quad \forall t \qquad (4.4.10)$$

implies that

$$D(t+1) = (1 - H(t+1)\,\bar{c}(t+1))\,\bar{a}(t) \qquad (4.4.11)$$

$$L(t+1) = (1 - H(t+1)\,\bar{c}(t+1))\,\bar{b}(t) \qquad (4.4.12)$$

and that

$$E\{x(0) - \hat{x}(0)\} = 0 \qquad (4.4.13)$$

or $\hat{x}(0) = \bar{x}_0$.
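
The reasoning behind these two conditions can be sketched by taking the expectation of Eq. (4.4.9). Using the assumptions of Section 4.2 that $a(t)$, $b(t)$ and $c(t+1)$ are white, mutually uncorrelated with $c(t+1)$, and uncorrelated with $x(t)$ and $\hat{x}(t)$, and that $\xi(t)$ and $\theta(t+1)$ are zero-mean, one obtains

$$E\{x(t+1) - \hat{x}(t+1)\} = \left[(1 - H(t+1)\,\bar{c}(t+1))\,\bar{a}(t) - D(t+1)\right] E\{x(t)\} + D(t+1)\,E\{x(t) - \hat{x}(t)\} - \left[(1 - H(t+1)\,\bar{c}(t+1))\,\bar{b}(t) - L(t+1)\right] G(t)\,E\{\hat{x}(t)\}.$$

If the estimate is unbiased at time $t$, the middle term is zero. For the remaining terms to vanish for an arbitrary mean state and an arbitrary control, both bracketed coefficients must be zero, which gives Eqs. (4.4.11) and (4.4.12). The induction is started by the initial condition (4.4.13).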

We obtain the form of the linear unbiased estimator

$$\hat{x}(t) = (1 - H(t)\,\bar{c}(t))\,(\bar{a}(t-1) - \bar{b}(t-1)\,G(t-1))\,\hat{x}(t-1) + H(t)\,z(t) \qquad (4.4.14)$$

driven by the measurements.

The state dynamics can be rewritten as

$$x(t) = (a(t-1) - b(t-1)\,G(t-1))\,x(t-1) + b(t-1)\,G(t-1)\,(x(t-1) - \hat{x}(t-1)) + \xi(t-1) \qquad (4.4.15)$$

The state estimation error is given by

$$x(t) - \hat{x}(t) = (1 - H(t)\,c(t))\,x(t) - H(t)\,\theta(t) - (1 - H(t)\,\bar{c}(t))\,(\bar{a}(t-1) - \bar{b}(t-1)\,G(t-1))\,\hat{x}(t-1)$$

$$= (1 - H(t)\,c(t))\,a(t-1)\,x(t-1) - (1 - H(t)\,c(t))\,b(t-1)\,G(t-1)\,\hat{x}(t-1) + (1 - H(t)\,c(t))\,\xi(t-1) - (1 - H(t)\,\bar{c}(t))\,(\bar{a}(t-1) - \bar{b}(t-1)\,G(t-1))\,\hat{x}(t-1) - H(t)\,\theta(t)$$

$$= (1 - H(t)\,c(t))\,(a(t-1) - b(t-1)\,G(t-1))\,x(t-1) + (1 - H(t)\,c(t))\,b(t-1)\,G(t-1)\,(x(t-1) - \hat{x}(t-1)) + (1 - H(t)\,c(t))\,\xi(t-1) + (1 - H(t)\,\bar{c}(t))\,(\bar{a}(t-1) - \bar{b}(t-1)\,G(t-1))\,(x(t-1) - \hat{x}(t-1)) - (1 - H(t)\,\bar{c}(t))\,(\bar{a}(t-1) - \bar{b}(t-1)\,G(t-1))\,x(t-1) - H(t)\,\theta(t) \qquad (4.4.16)$$

We remark that the estimation error $x(t) - \hat{x}(t)$ depends on $x(t)$ and $z^t$ when $\Sigma_{aa}(t) \neq 0$, $\Sigma_{bb}(t) \neq 0$, or $\Sigma_{cc}(t) \neq 0$. This means that the control will affect the estimation performance, i.e., $\Sigma_{xx}(t|t)$, as we shall see in the following development of the $M(t)$ matrix.

In  the  derivations  below  we  shall  assume  that  a(t) 
and  b(t)  are  independent  to  simplify  the  algebra.  The  ele- 
ments of  the  second  moment  matrix  for  the  vector  m(t)  then 
propagate  according  to  the  following  difference  equations, 
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' 


$$M_{00}(t) = E\{(a(t-1) - b(t-1)\,G(t-1))^2\}\,M_{00}(t-1) + 2E\{(a(t-1) - b(t-1)\,G(t-1))\,b(t-1)\,G(t-1)\}\,M_{01}(t-1) + E\{b^2(t-1)\}\,G^2(t-1)\,M_{11}(t-1) + \Xi(t-1)$$

$$= (\bar{a}(t-1) - \bar{b}(t-1)\,G(t-1))^2\,M_{00}(t-1) + \Sigma_{aa}(t-1)\,M_{00}(t-1) + \Sigma_{bb}(t-1)\,G^2(t-1)\,M_{00}(t-1) + 2\bar{b}(t-1)\,G(t-1)\,(\bar{a}(t-1) - \bar{b}(t-1)\,G(t-1))\,M_{01}(t-1) - 2\Sigma_{bb}(t-1)\,G^2(t-1)\,M_{01}(t-1) + \Sigma_{bb}(t-1)\,G^2(t-1)\,M_{11}(t-1) + \bar{b}^2(t-1)\,G^2(t-1)\,M_{11}(t-1) + \Xi(t-1) \qquad (4.4.17)$$

$$M_{01}(t) = E\{(a(t-1) - b(t-1)\,G(t-1))\,[(1 - H(t)\,c(t))\,(a(t-1) - b(t-1)\,G(t-1)) - (1 - H(t)\,\bar{c}(t))\,(\bar{a}(t-1) - \bar{b}(t-1)\,G(t-1))]\}\,M_{00}(t-1)$$

$$+ E\{(a(t-1) - b(t-1)\,G(t-1))\,[(1 - H(t)\,\bar{c}(t))\,(\bar{a}(t-1) - \bar{b}(t-1)\,G(t-1))]\}\,M_{01}(t-1)$$

$$+ E\{b(t-1)\,G(t-1)\,[(1 - H(t)\,c(t))\,(a(t-1) - b(t-1)\,G(t-1)) - (1 - H(t)\,\bar{c}(t))\,(\bar{a}(t-1) - \bar{b}(t-1)\,G(t-1))]\}\,M_{10}(t-1)$$

$$+ E\{b(t-1)\,G(t-1)\,[(1 - H(t)\,\bar{c}(t))\,(\bar{a}(t-1) - \bar{b}(t-1)\,G(t-1)) + (1 - H(t)\,c(t))\,b(t-1)\,G(t-1)]\}\,M_{11}(t-1) + (1 - H(t)\,\bar{c}(t))\,\Xi(t-1)$$

so that

$$M_{01}(t) = (1 - H(t)\,\bar{c}(t))\,(\Sigma_{aa}(t-1) + \Sigma_{bb}(t-1)\,G^2(t-1))\,M_{00}(t-1) + (1 - H(t)\,\bar{c}(t))\,(\bar{a}^2(t-1) - 2\Sigma_{bb}(t-1)\,G^2(t-1) - \bar{a}(t-1)\,\bar{b}(t-1)\,G(t-1))\,(M_{01}(t-1) + M_{10}(t-1)) + (1 - H(t)\,\bar{c}(t))\,(\bar{a}(t-1)\,\bar{b}(t-1)\,G(t-1) + \Sigma_{bb}(t-1)\,G^2(t-1))\,M_{11}(t-1) + (1 - H(t)\,\bar{c}(t))\,\Xi(t-1) \qquad (4.4.18)$$

after some algebraic manipulations.

The state error covariance equation is given by

$$M_{11}(t) = E\{[(1 - H(t)\,c(t))\,(a(t-1) - b(t-1)\,G(t-1)) - (1 - H(t)\,\bar{c}(t))\,(\bar{a}(t-1) - \bar{b}(t-1)\,G(t-1))]^2\}\,M_{00}(t-1)$$

$$+ 2E\{[(1 - H(t)\,c(t))\,(a(t-1) - b(t-1)\,G(t-1)) - (1 - H(t)\,\bar{c}(t))\,(\bar{a}(t-1) - \bar{b}(t-1)\,G(t-1))]\,[(1 - H(t)\,\bar{c}(t))\,(\bar{a}(t-1) - \bar{b}(t-1)\,G(t-1)) + (1 - H(t)\,c(t))\,b(t-1)\,G(t-1)]\}\,M_{01}(t-1)$$

$$+ E\{[(1 - H(t)\,\bar{c}(t))\,(\bar{a}(t-1) - \bar{b}(t-1)\,G(t-1)) + (1 - H(t)\,c(t))\,b(t-1)\,G(t-1)]^2\}\,M_{11}(t-1) + ((1 - H(t)\,\bar{c}(t))^2 + \Sigma_{cc}(t)\,H^2(t))\,\Xi(t-1) + H^2(t)\,\Theta(t)$$
= (1  -H(t)  F(t))2[^a2(t-1)  Mn(t-1) 

+ wt-i)G2(t-l)Mn(t-l)  + ! 

+ Ebb(t-D  G2(t-D  Moott-i)  (4.4.10) 
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+ 2Ebb( t-1 ) G2(t-1) MQ1(t-l)  + 5(t-l)J 

+ H2(  t ) Ecc(t)  j^(  a2(  t-1 ) + Eaa(  t-1 ) 

- 2a ( t - 1 ) b(t-l)  G(t-l) 

+ (b2( t-1)  + Ebb(t-1))  G2( t-1 ) ) MQ0(t-l) 
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Mlx(t)  = a2(t-l)  Mn(t-1)  + Eaa(  t-1 ) M00(t-1) 

+ Ebb( t-1 ) G2(t-1 ) • (MQ0(t-l)  - 2MQ1(t-l) 


+ M11(t-1))  + H(t-l) 


(4.4.23) 

Initial conditions for the dynamical system are given by

$$M_{00}(0) = \bar{x}_0^2 + \Sigma_{x0} \geq 0 \qquad (4.4.24)$$

$$M_{01}(0) = \Sigma_{x0} \geq 0 \qquad (4.4.25)$$

$$M_{11}(0) = \Sigma_{x0} \geq 0 \qquad (4.4.26)$$

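Thus we have formulated the following deterministic optimal control problem. Given the system described by the dynamical Eqs. (4.4.20)-(4.4.23), the initial condition

$$M(0) = \begin{bmatrix} \bar{x}_0^2 + \Sigma_{x0} & \Sigma_{x0} \\ \Sigma_{x0} & \Sigma_{x0} \end{bmatrix} \qquad (4.4.27)$$

and the cost functional

$$J = \mathrm{tr}\,[\mathbf{F}\,M(N)] + \sum_{t=0}^{N-1} \mathrm{tr}\,[\mathbf{Q}(t)\,M(t)] \qquad (4.4.28)$$

where

$$\mathbf{F} = \begin{bmatrix} F & 0 \\ 0 & 0 \end{bmatrix} \qquad (4.4.29)$$

$$\mathbf{Q}(t) = \begin{bmatrix} Q(t) + R(t)\,G^2(t) & -R(t)\,G^2(t) \\ -R(t)\,G^2(t) & R(t)\,G^2(t) \end{bmatrix} \qquad (4.4.30)$$

find the gains $G(t)$ and $H(t)$ such that $J$ is minimized.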


This problem can be solved using the matrix minimum principle or dynamic programming. The first solution, using the matrix minimum principle, is summarized in the following theorem.

4.5 Solution of the Deterministic Control Problem

Theorem 4.3. Given the deterministic dynamical system Eqs. (4.4.12) to (4.4.18) and the cost functional Eq. (4.4.19), the optimum control and filter gains are respectively given by

$$G^*(t) = \frac{\bar{a}(t)\,\bar{b}(t)\,(\Sigma_{cc}(t+1)\,H^{*2}(t+1)\,P^*_{11}(t+1) + P^*_{00}(t+1))}{(\bar{b}^2(t) + \Sigma_{bb}(t))\,(\Sigma_{cc}(t+1)\,H^{*2}(t+1)\,P^*_{11}(t+1) + P^*_{00}(t+1)) + R(t) + \Sigma_{bb}(t)\,(1 - H^*(t+1)\,\bar{c}(t+1))^2\,P^*_{11}(t+1)} \qquad (4.5.1)$$

and

$$H^*(t+1) = \frac{\left[\bar{a}^2(t)\,M^*_{11}(t) + \Sigma_{aa}(t)\,M^*_{00}(t) + \Sigma_{bb}(t)\,G^{*2}(t)\,(M^*_{00}(t) - M^*_{11}(t)) + \Xi(t)\right] \bar{c}(t+1)}{\bar{c}^2(t+1)\left(\bar{a}^2(t)\,M^*_{11}(t) + \Sigma_{aa}(t)\,M^*_{00}(t) + \Sigma_{bb}(t)\,G^{*2}(t)\,(M^*_{00}(t) - M^*_{11}(t)) + \Xi(t)\right) + \Sigma_{cc}(t+1)\,M^*_{00}(t+1) + \Theta(t+1)}$$

$$= M^*_{11}(t+1)\,\bar{c}(t+1)\left[\Sigma_{cc}(t+1)\,M^*_{00}(t+1) + \Theta(t+1)\right]^{-1} \qquad (4.5.2)$$

where the state second moment equation is given by

$$M^*_{00}(t+1) = (\bar{a}(t) - \bar{b}(t)\,G^*(t))^2\,M^*_{00}(t) + 2\bar{b}(t)\,G^*(t)\,(\bar{a}(t) - \bar{b}(t)\,G^*(t))\,M^*_{11}(t) + \Xi(t) + \bar{b}^2(t)\,G^{*2}(t)\,M^*_{11}(t) + \Sigma_{aa}(t)\,M^*_{00}(t) + \Sigma_{bb}(t)\,G^{*2}(t)\,(M^*_{00}(t) - M^*_{11}(t)), \quad M^*_{00}(0) = \Sigma_{x0} + \bar{x}_0^2 \qquad (4.5.3)$$

The state estimation error covariance equation is given by

$$M^*_{11}(t+1) = (1 - H^*(t+1)\,\bar{c}(t+1))^2\left[\bar{a}^2(t)\,M^*_{11}(t) + \Sigma_{aa}(t)\,M^*_{00}(t) + \Sigma_{bb}(t)\,G^{*2}(t)\,(M^*_{00}(t) - M^*_{11}(t)) + \Xi(t)\right] + H^{*2}(t+1)\left[\Sigma_{cc}(t+1)\,M^*_{00}(t+1) + \Theta(t+1)\right], \quad M^*_{11}(0) = \Sigma_{x0} \qquad (4.5.4)$$

and the co-states $P^*_{00}(t)$ and $P^*_{11}(t)$ are propagated backwards by the equations

$$P^*_{00}(t) = (\bar{a}^2(t) + \Sigma_{aa}(t))\,(\Sigma_{cc}(t+1)\,H^{*2}(t+1)\,P^*_{11}(t+1) + P^*_{00}(t+1)) + Q(t) - G^{*2}(t)\left[(\bar{b}^2(t) + \Sigma_{bb}(t))\,(\Sigma_{cc}(t+1)\,H^{*2}(t+1)\,P^*_{11}(t+1) + P^*_{00}(t+1)) + R(t) + \Sigma_{bb}(t)\,(1 - H^*(t+1)\,\bar{c}(t+1))^2\,P^*_{11}(t+1)\right] + \Sigma_{aa}(t)\,(1 - H^*(t+1)\,\bar{c}(t+1))^2\,P^*_{11}(t+1), \quad P^*_{00}(N) = F \qquad (4.5.5)$$


$$P^*_{11}(t) = \bar{a}^2(t)\,(1 - H^*(t+1)\,\bar{c}(t+1))^2\,P^*_{11}(t+1) + G^{*2}(t)\left[(\bar{b}^2(t) + \Sigma_{bb}(t))\,(\Sigma_{cc}(t+1)\,H^{*2}(t+1)\,P^*_{11}(t+1) + P^*_{00}(t+1)) + \Sigma_{bb}(t)\,(1 - H^*(t+1)\,\bar{c}(t+1))^2\,P^*_{11}(t+1) + R(t)\right], \quad P^*_{11}(N) = 0 \qquad (4.5.6)$$

Proof: See Appendix A.


The optimal linear time-varying feedback control law is thus

$$u^*(t) = -G^*(t)\,\hat{x}(t) \qquad (4.5.7)$$

where the time-varying gain $G^*(t)$ is given by Eq. (4.5.1) and the linear minimum variance unbiased estimate $\hat{x}(t|t)$ is given by

$$\hat{x}(t+1) = (1 - H^*(t+1)\,\bar{c}(t+1))\,(\bar{a}(t) - \bar{b}(t)\,G^*(t))\,\hat{x}(t) + H^*(t+1)\,z(t+1), \quad \hat{x}(0) = \bar{x}_0 \qquad (4.5.8)$$

and $z(t+1)$ is the measurement "driving" term.

At the initial time ($t = 0$)

$$M_{00}(0) = \Sigma_{x0} + \bar{x}_0^2 \qquad (4.5.9)$$

$$M_{11}(0) = \Sigma_{x0} \qquad (4.5.10)$$

At the terminal time ($t = N$)

$$P_{00}(N) = F \qquad (4.5.11)$$

$$P_{11}(N) = 0 \qquad (4.5.12)$$

The fixed structure controller is shown in Fig. 4.2. Using the Matrix Minimum Principle, we have obtained the necessary conditions for optimum control. To compute the optimum control gain sequence at time $t$, we need $P_{11}(t+1)$, $P_{00}(t+1)$, and $H(t+1)$. Since $P_{00}(t)$ and $P_{11}(t)$ are given at the terminal time $N$, they have to be propagated backwards from time $N$. The filter gain $H(t+1)$ depends on $M_{00}(t)$, $M_{11}(t)$, and $G(t)$. Since $M_{00}(t)$ and $M_{11}(t)$ are given at the initial time, they have to be propagated forward in time. The solution using the Matrix Minimum Principle is a true nonlinear two-point boundary value problem (TPBVP) that has to be solved by iterative methods.

If we substitute the expression for H(t+1) into
the forward difference equations for $M_{00}(t+1)$ and $M_{11}(t+1)$ we
see that they are coupled nonlinear difference equations in
general. In the special case where $\Sigma_{aa}(t) = \Sigma_{bb}(t) = \Sigma_{cc}(t) = 0$,
as is assumed in the standard linear-quadratic-Gaussian problem,
the $M_{00}(t)$ and $M_{11}(t)$ equations become decoupled.

More precisely,

$$M_{00}(t+1) = \bar a^2(t)\,M_{00}(t) - 2\,\bar a(t)\,\bar b(t)\,G(t)\left(M_{00}(t) - M_{11}(t)\right) + \bar b^2(t)\,G^2(t)\left(M_{00}(t) - M_{11}(t)\right) + \Xi(t) \tag{4.5.13}$$

where

$$G(t) = \frac{\bar b(t)\,P_{00}(t+1)\,\bar a(t)}{\bar b^2(t)\,P_{00}(t+1) + R(t)} \tag{4.5.14}$$


Thus, the mean square of the state, $M_{00}(t) = E\{x^2(t)\}$, depends
on the error covariance quantities $M_{11}(t)$. But the covariance
is completely decoupled from the second moment of
the state since

$$M_{11}(t+1) = \left(1 - H(t+1)\,\bar c(t+1)\right)^2\left(\bar a^2(t)\,M_{11}(t) + \Xi(t)\right) + H^2(t+1)\,\Theta(t+1) \tag{4.5.15}$$

This is just the measurement update covariance equation in
the Kalman filter.

Equation (4.5.13) for $M_{00}(t)$ is the mean square
history of the state variable x(t).

$$M_{00}(t+1) = \left(\bar a(t) - \bar b(t)\,G(t)\right)^2\left(M_{00}(t) - M_{11}(t)\right) + \Xi(t) + \bar a^2(t)\,M_{11}(t) \tag{4.5.16}$$

This is the same result obtained in ([75], Eq. 4.7.30).

Let us now analyze the co-state equations $P_{00}(t)$
and $P_{11}(t)$. If we let $\Sigma_{aa}(t) = \Sigma_{bb}(t) = \Sigma_{cc}(t) = 0$, we obtain

$$P_{00}(t) = \bar a^2(t)\,P_{00}(t+1) + Q(t) - \frac{\bar a^2(t)\,\bar b^2(t)\,P_{00}^2(t+1)}{\bar b^2(t)\,P_{00}(t+1) + R(t)} \tag{4.5.17}$$

This is just the nonlinear Riccati difference equation encountered
in the discrete-time deterministic optimal control
problem. We know that the solution exists and is unique
and finite if the system is controllable.
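The backward sweep implied by Eq. (4.5.17) and the gain of Eq. (4.5.14) can be sketched numerically. The following is only an illustration under stated assumptions: the parameters are taken as time-invariant scalars, and the function name `riccati_backward` and the sample values are hypothetical, not taken from the thesis.

```python
def riccati_backward(a, b, Q, R, F, N):
    """Backward sweep of the scalar Riccati difference equation (4.5.17).

    Assumes constant a, b, Q, R (an illustrative simplification).
    Returns P[0..N] with P[N] = F and the gains G[0..N-1] of Eq. (4.5.14).
    """
    P = [0.0] * (N + 1)
    G = [0.0] * N
    P[N] = F                                  # terminal condition P00(N) = F
    for t in range(N - 1, -1, -1):
        denom = b * b * P[t + 1] + R          # b^2 P00(t+1) + R
        G[t] = a * b * P[t + 1] / denom       # Eq. (4.5.14)
        P[t] = a * a * P[t + 1] + Q - (a * b * P[t + 1]) ** 2 / denom
    return P, G


# Hypothetical numbers: an unstable open-loop plant with unit weights.
P, G = riccati_backward(a=1.1, b=1.0, Q=1.0, R=1.0, F=1.0, N=50)
```

Far from the terminal time the sequence `P[t]` settles to a constant, which is the steady-state behavior the later sections look for in the random-parameter case.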


The deterministic co-state equation for $P_{11}(t)$ is
given by

$$P_{11}(t) = \bar a^2(t)\left(1 - H(t+1)\,\bar c(t+1)\right)^2 P_{11}(t+1) + \frac{\bar b^2(t)\,\bar a^2(t)\,P_{00}^2(t+1)}{\bar b^2(t)\,P_{00}(t+1) + R(t)} \tag{4.5.18}$$

Since in the case where the parameters are known

$$H(t+1) = M_{11}(t+1)\,\bar c(t+1)\,\Theta^{-1}(t+1) \tag{4.5.19}$$

$P_{11}(t)$ is still coupled to the $M_{11}(t)$ equation, but is
uncoupled from the $P_{00}(\cdot)$ equation.

In the linear-quadratic-Gaussian problem $M_{11}(t)$
and $P_{00}(t)$ are used to compute the optimal filter gains
and control gains, respectively. The forward ($M_{11}$) and
backward ($P_{00}$) difference equations are completely uncoupled
from each other. This is a very fortunate situation. The
two-point boundary value problem can be solved as two
single-point boundary value problems.

The fact that the co-state $P_{00}(t)$ is the solution
of the Riccati equation when the system parameters are known
perfectly suggests that it has some physical interpretation.

If we think of the co-states P(t) as the gradient of the
cost with respect to the state variables as in the
Hamilton-Jacobi-Bellman approach, i.e.,

$$P(t) = \frac{\partial J}{\partial M(t)} \tag{4.5.20}$$

then it is evident that the co-state equation defines the
evolution of the partial derivatives $\partial J/\partial M_{00}(t)$ and
$\partial J/\partial M_{11}(t)$ for $t \in [0, N]$.

From the expression for the average value of the
quadratic cost functional, Eq. (4.4.28),

$$J = F\,M_{00}(N) + \sum_{t=0}^{N-1}\left[Q(t)\,M_{00}(t) + R(t)\,G^2(t)\left(M_{00}(t) - M_{11}(t)\right)\right] \tag{4.5.21}$$

If we now add $P_{00}(0)\,M_{00}(0)$ and $P_{11}(0)\,M_{11}(0)$ outside
the summation and compensate this by adding the terms
$P_{00}(t+1)\,M_{00}(t+1) - P_{00}(t)\,M_{00}(t)$ and
$P_{11}(t+1)\,M_{11}(t+1) - P_{11}(t)\,M_{11}(t)$ inside the summation,
the expression is not changed, because the added sums telescope
and $P_{00}(N) = F$, $P_{11}(N) = 0$. We get

$$
\begin{aligned}
J ={}& P_{00}(0)\,M_{00}(0) + P_{11}(0)\,M_{11}(0) + \sum_{t=0}^{N-1}\Big[Q(t)\,M_{00}(t) + R(t)\,G^2(t)\left(M_{00}(t) - M_{11}(t)\right) \\
& + P_{00}(t+1)\,M_{00}(t+1) - P_{00}(t)\,M_{00}(t) + P_{11}(t+1)\,M_{11}(t+1) - P_{11}(t)\,M_{11}(t)\Big]
\end{aligned}
\tag{4.5.22}
$$

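Now we substitute into the above equation the
expressions for $M_{00}(t+1)$, $P_{00}(t)$, $M_{11}(t+1)$, and $P_{11}(t)$, namely

$$
\begin{aligned}
M_{00}(t+1) ={}& \left(\bar a(t) - \bar b(t)\,G(t)\right)^2\left(M_{00}(t) - M_{11}(t)\right) + \bar a^2(t)\,M_{11}(t) + \Xi(t) \\
& + \Sigma_{aa}(t)\,M_{00}(t) + \Sigma_{bb}(t)\,G^2(t)\left(M_{00}(t) - M_{11}(t)\right)
\end{aligned}
\tag{4.5.23}
$$

and Equations (4.5.4), (4.5.5), and (4.5.6), respectively. We
obtain that

$$
\begin{aligned}
J ={}& P_{00}(0)\,M_{00}(0) + P_{11}(0)\,M_{11}(0) + \sum_{t=0}^{N-1}\Big\{ P_{00}(t+1)\,\Xi(t) \\
& + M_{11}(t)\Big[2\,\bar a(t)\,\bar b(t)\,G(t)\,P_{00}(t+1) - \left(\left(\bar b^2(t) + \Sigma_{bb}(t)\right)P_{00}(t+1) + R(t)\right)G^2(t)\Big] \\
& - M_{00}(t)\Big[\left(\bar a(t) - \bar b(t)\,G(t)\right)^2\Sigma_{cc}(t+1)\,H^2(t+1) \\
& \qquad + \left(\Sigma_{aa}(t) + G^2(t)\,\Sigma_{bb}(t)\right)\left(\Sigma_{cc}(t+1)\,H^2(t+1) + \left(1 - H(t+1)\,\bar c(t+1)\right)^2\right)\Big]P_{11}(t+1) \\
& + P_{11}(t+1)\Big[\left(1 - H(t+1)\,\bar c(t+1)\right)^2\left(\bar a^2(t)\,M_{11}(t) + \Sigma_{aa}(t)\,M_{00}(t) + \Sigma_{bb}(t)\,G^2(t)\left(M_{00}(t) - M_{11}(t)\right) + \Xi(t)\right) \\
& \qquad + H^2(t+1)\,\Sigma_{cc}(t+1)\Big(\left(\bar a(t) - \bar b(t)\,G(t)\right)^2\left(M_{00}(t) - M_{11}(t)\right) + \bar a^2(t)\,M_{11}(t) + \Xi(t) \\
& \qquad\qquad + \Sigma_{aa}(t)\,M_{00}(t) + \Sigma_{bb}(t)\,G^2(t)\left(M_{00}(t) - M_{11}(t)\right)\Big) + H^2(t+1)\,\Theta(t+1)\Big] \\
& - \Big[\bar a^2(t)\left(1 - H(t+1)\,\bar c(t+1)\right)^2 P_{11}(t+1) \\
& \qquad + G(t)\,\bar a(t)\,\bar b(t)\left(\Sigma_{cc}(t+1)\,H^2(t+1)\,P_{11}(t+1) + P_{00}(t+1)\right)\Big]M_{11}(t)\Big\}
\end{aligned}
\tag{4.5.24}
$$

Most of the terms cancel, and we get as a result the optimal cost

$$
\begin{aligned}
J^* ={}& P_{00}(0)\,M_{00}(0) + P_{11}(0)\,M_{11}(0) + \sum_{t=0}^{N-1}\Big\{ P_{00}(t+1)\,\Xi(t) \\
& + P_{11}(t+1)\Big[\left(1 - H(t+1)\,\bar c(t+1)\right)^2\Xi(t) + H^2(t+1)\,\Sigma_{cc}(t+1)\,\Xi(t) + H^2(t+1)\,\Theta(t+1)\Big]\Big\}
\end{aligned}
\tag{4.5.25}
$$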

In the well-known linear-quadratic-Gaussian problem,
the average cost is given by

$$J = P_{00}(0)\,M_{00}(0) + \sum_{t=0}^{N-1}\left[P_{00}(t+1)\,\Xi(t) + P_{00}(t+1)\,\bar b(t)\,G(t)\,\bar a(t)\,M_{11}(t)\right] \tag{4.5.26}$$

where

$$G(t) = \frac{\bar b(t)\,P_{00}(t+1)\,\bar a(t)}{\bar b^2(t)\,P_{00}(t+1) + R(t)} \tag{4.5.27}$$

In this case, if we define

$$P_{11}(t) = \bar a^2(t)\left(1 - H(t+1)\,\bar c(t+1)\right)^2 P_{11}(t+1) + \bar b(t)\,P_{00}(t+1)\,\bar a(t)\,G(t), \qquad P_{11}(N) = 0 \tag{4.5.28}$$

then

$$
\begin{aligned}
J ={}& P_{00}(0)\,M_{00}(0) + P_{11}(0)\,M_{11}(0) + \sum_{t=0}^{N-1}\Big\{P_{00}(t+1)\,\Xi(t) \\
& + P_{11}(t+1)\Big[\left(1 - H(t+1)\,\bar c(t+1)\right)^2\Xi(t) + H^2(t+1)\,\Theta(t+1)\Big]\Big\}
\end{aligned}
\tag{4.5.29}
$$

where

$$H(t+1) = M_{11}(t+1)\,\bar c(t+1)\,\Theta^{-1}(t+1) \tag{4.5.30}$$

The average cost in the stochastic control problem
is composed of terms due to the initial state uncertainty
and due to the plant noise $\xi(t)$ and measurement noise $\theta(t)$.

We remark that the form of the optimal cost obtained
here is the discrete-time equivalent of that obtained in the
solution to the two-controller team problem in [74].

Sufficiency conditions for optimality may be obtained
from the second partial derivatives of J with respect
to G and H. Taking the derivatives of $\partial J/\partial G$ and $\partial J/\partial H$ we
then obtain that the sufficient conditions for a strong
minimum are

$$\text{(i)}\quad \left(\bar b^2(t) + \Sigma_{bb}(t)\right)\left(P_{00}(t+1) + \Sigma_{cc}(t+1)\,H^2(t+1)\,P_{11}(t+1)\right) + \Sigma_{bb}(t)\left(1 - H(t+1)\,\bar c(t+1)\right)^2 P_{11}(t+1) + R(t) > 0 \tag{4.5.31}$$

$$\text{(ii)}\quad M_{00}(t) - M_{11}(t) > 0 \tag{4.5.32}$$

$$\text{(iii)}\quad \bar c^2(t)\,\bar M_{11}(t) + \Theta(t) + \Sigma_{cc}(t)\,M_{00}(t) > 0 \tag{4.5.33}$$

We remark that in condition (i), the randomness in
the parameter b(t) introduces mathematically equivalent
control penalties into the control problem. Hence if R(t)
is selected wrongly, then $\Sigma_{bb}(t)$ can be used to account for
the error. In condition (iii), $\Sigma_{cc}(t)\,M_{00}(t)$ is positive
semidefinite if $M_{00}(t)$ is positive semidefinite. The product
will increase the effective weighting $[\Theta(t) + \Sigma_{cc}(t)\,M_{00}(t)]$
that needs to be inverted in Eq. (4.5.2). So the randomness
in the parameters b(t) and c(t) effectively makes the solution
more stable numerically.

We note that if Q(t) = 0, then $P_{00}(t) = 0$ if $\Sigma_{aa}(t) = \Sigma_{cc}(t) = 0$,
but $P_{00}(t) \neq 0$ if $\Sigma_{aa}(t)$ or $\Sigma_{cc}(t)$ is nonzero. In
the case $P_{00}(t) = 0$ and R(t) = 0, the control gain G(t) in
Eq. (4.5.1) may still be a well-defined quantity due to the
uncertainty in c(t) ($\Sigma_{cc}(t) \neq 0$).

In the special case when the measurements are exact,
so that $\Theta(t) = 0$ and $\Sigma_{cc}(t) = 0$, the equations for the optimal
stochastic control problem, Eqs. (4.5.1) to (4.5.6), reduce
to the same results obtained in Chapter 2.

Problem  Solution  Using  Dynamic  Programming 

We have seen that the minimum principle gives the
necessary conditions for the minimization of the quadratic
cost function Eq. (4.4.28). It reduced the optimum systems
control problem to a nonlinear two-point boundary value
problem. The solution yields an optimum open-loop control.
For the standard linear-quadratic (regulator) problem, the
two-point boundary value problem can be replaced by solving
a Riccati difference equation to obtain the gains of the
closed-loop system. In general, the set of difference
equations may not be solved in a straightforward manner and
this remark applies to Eqs. (4.5.1) to (4.5.12).

A direct method to solve the optimization problem
is the dynamic programming algorithm [7]. Discrete dynamic
programming is essentially the repeated sequential (stage by
stage) application of the Hamilton-Jacobi equation (continuous
dynamic programming) or of Bellman's Principle of Optimality
[7]. From the solution of dynamic programming we immediately
know the cost-to-go function as well as the closed-loop control
and optimum trajectory. The dynamic programming method
minimizes the given cost functional directly and thus yields a
Riccati equation without introducing a two-point boundary
value problem. However, it generally requires guessing the
form of the solution to the functional equation.

We now give a useful alternative method of solution
to the optimum control problem. The objective of the
closed-loop optimal stochastic control system is to minimize
the average cost functional,

$$J = E\left\{x^2(T)\,F + \sum_{t=0}^{T-1}\left[Q(t)\,x^2(t) + R(t)\,u^2(t)\right]\right\} \tag{4.5.34}$$

where both x(t) and u(t) are random sequences subject to the
system dynamics

$$x(t+1) = a(t)\,x(t) + b(t)\,u(t) + \xi(t) \tag{4.5.35}$$

The state is measured imperfectly according to the equation

$$z(t) = c(t)\,x(t) + \theta(t) \tag{4.5.36}$$

The expectation in Eq. (4.5.34) is taken with respect to the
random variables x(0), $\xi(t)$, $\theta(t)$, a(t), b(t), and c(t).

In the suboptimal design of the stochastic control
system, we will restrict our attention to linear controllers
and linear filters. Using this approach, necessary optimality
conditions are derived using the dynamic programming method.

We are interested in control laws having the form

$$u(t) = -G(t)\,\hat x(t) \tag{4.5.37}$$

where G(t) as before is a time-varying linear control gain to
be determined. The best estimate $\hat x(t)$ is a priori specified
to be given by the recursive equation

$$\hat x(t+1) = \bar a(t)\,\hat x(t) + \bar b(t)\,u(t) + H(t+1)\left[z(t+1) - \bar c(t+1)\,\hat x(t+1|t)\right] \tag{4.5.38}$$

$$\hat x(t+1|t) = \bar a(t)\,\hat x(t) + \bar b(t)\,u(t) \tag{4.5.39}$$

where H(t+1) is the time-varying filter gain to be determined.
Notice that we restrict ourselves to considerations of a
specific controller-estimator structure and optimize the
choice of "control" sequences G(t) and H(t) over the parameter
space.

Equation (4.5.37) explicitly specifies the admissible
class of control that will be allowed in the optimization.
The structure of Eq. (4.5.37) is a mathematically
realizable control. The control u(t) at any time t
depends on all information available up to time t. The
information set is $\{Z^t, U^{t-1}\} = \{z(1), z(2), \ldots, z(t), u(0), \ldots, u(t-1)\}$.
Mathematically, u(t) is a linear map of all
past measurements and controls, and, perhaps, of time t.

We expect to make future observations (from time t on) and
that the future controls will be functions of these
measurements.
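One step of the fixed-structure compensator defined by Eqs. (4.5.37) to (4.5.39) can be sketched as follows. This is a minimal illustration assuming scalar quantities and given gains; the function name `compensator_step` and the numbers in the usage lines are hypothetical.

```python
def compensator_step(x_hat, z_next, a_bar, b_bar, c_bar, G, H_next):
    """One cycle of the linear estimator-controller (scalar sketch).

    x_hat  : current estimate
    z_next : measurement z(t+1)
    G      : control gain G(t); H_next : filter gain H(t+1)
    Returns the control u(t) and the updated estimate.
    """
    u = -G * x_hat                                        # Eq. (4.5.37)
    x_pred = a_bar * x_hat + b_bar * u                    # Eq. (4.5.39)
    x_new = x_pred + H_next * (z_next - c_bar * x_pred)   # Eq. (4.5.38)
    return u, x_new


# Illustrative values only.
u, x_new = compensator_step(x_hat=2.0, z_next=1.5, a_bar=1.1, b_bar=1.0,
                            c_bar=1.0, G=0.7, H_next=0.4)
```

Eliminating `u` and `x_pred` gives the closed form of Eq. (4.5.8): the updated estimate is (1 - H c)(a - b G) times the old estimate plus H times the new measurement.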

The stochastic control problem will be stated formally
now. Given the dynamic system Eq. (4.2.1) and the
observation Eq. (4.2.1), and the information set $\{Z^t, U^{t-1}\}$, find
the control law in the class specified by Eq. (4.4.1) such
that the "average cost-to-go" given by

$$J_0(\tau) = E\left\{F\,x^2(N) + \sum_{t=\tau}^{N-1}\left[Q(t)\,x^2(t) + R(t)\,u^2(t)\right] \,\Big|\, Z^\tau, U^{\tau-1}\right\} \tag{4.5.40}$$

is minimum. The weightings are $Q(t) \ge 0$, $F \ge 0$, and $R(t) > 0$.

The statistical properties of the additive noises $\xi(t)$ and
$\theta(t)$ and the purely random (white) parameters a(t), b(t), and c(t)
are the same as those assumed in Section 4.2.

We show in Appendix B that the optimum solution
obtained by applying the dynamic programming algorithm is
the same as that given in Theorem 4.3.

4.6 Discussion of the Optimal Linear Controller

We remark here that the solution in terms of a coupled
nonlinear two-point boundary value problem was also obtained
in [74], which considered the decentralized control of linear
systems with different information sets. It was also pointed
out that in the general case


The filter derived in [74] is not the Kalman filter, although
it is linear and unbiased. In our problem solution, the
orthogonality condition assumption allowed the problem to
be solved analytically. This same conclusion was made by [76].

It can be seen from Eqs. (4.5.1) and (4.5.2) for the
gains G(t) and H(t) that the product of the state and co-state,
$P_{11}(t)\,M_{11}(t)$, plays an important role. Note that H(t)
depends mainly on M(t), while G(t) depends mainly on P(t).

In the deterministic case, G(t) depends only on $P_{00}(t)$ and
H(t) depends only on $M_{11}(t)$. The uncertainty in the parameters,
reflected by $\Sigma_{aa}(t) \neq 0$, $\Sigma_{bb}(t) \neq 0$, and $\Sigma_{cc}(t) \neq 0$, has
coupled the state and co-states together.

The gain H(t) resembles the filter gain H(t) for
the deterministic LQG problem except that $\Theta(t)$ is replaced
by $[\Theta(t) + \Sigma_{cc}(t)\,M_{00}(t)]$. The state $M_{00}(t)$ now plays an
important part in the filter gain computation. Even with
perfect (noise-free) measurement, the measurement will be
weighed accordingly because of the multiplicative noise in
the measurement equation. In the deterministic LQG case,
H(t) depends only on $\Theta(t)$, the measurement error covariance.
Furthermore, $M_{11}(t)$ depends on P(t) through the control gains
G(t).

The control gains G(t) are similar to the G(t)
given in Eq. (4.5.27) except that $P_{00}(t)$, the solution to
the Riccati equation, has been replaced by expressions
involving both P(t) and M(t), i.e., $[P_{00}(t) + \Sigma_{cc}(t)\,H^2(t)\,P_{11}(t)]$.
They are no longer the deterministic optimal control
gains, but depend on the error covariances of the state
estimates.

The equations for G(t) and H(t) are complicated
expressions, so we shall consider some of the special cases.

Remark 4.1. If $\Sigma_{cc}(t) = 0$, $\Sigma_{bb}(t) \neq 0$, and $\Sigma_{aa}(t) \neq 0$, then we
have essentially the results of Chapter 2, control of linear
stochastic systems with perfect measurements ($\Theta(t) = 0$).

Remark 4.2. If $\Sigma_{bb}(t) = 0$, then this says that the control
input has a deterministic multiplier. To reduce Eq. (4.5.1)
to the pure estimation problem ($\Sigma_{aa}(t) \neq 0$, $\Sigma_{cc}(t) \neq 0$), set
R(t) = 0, so that

$$G(t) = \frac{\bar a(t)}{\bar b(t)} \tag{4.6.2}$$

and the closed-loop system parameter

$$\bar a(t) - \bar b(t)\,G(t) = 0 \tag{4.6.3}$$

The Eqs. (4.5.3) and (4.5.4) for the error covariance then
evolve as

$$M_{00}(t+1) = \Sigma_{aa}(t)\,M_{00}(t) + \Xi(t) + \bar a^2(t)\,M_{11}(t) \tag{4.6.4}$$

$$\bar M_{11}(t+1) = \bar a^2(t)\,M_{11}(t) + \Sigma_{aa}(t)\,M_{00}(t) + \Xi(t) \tag{4.6.5}$$

$$
\begin{aligned}
M_{11}(t+1) ={}& \left(1 - H(t+1)\,\bar c(t+1)\right)^2\left[\bar a^2(t)\,M_{11}(t) + \Sigma_{aa}(t)\,M_{00}(t) + \Xi(t)\right] \\
& + H^2(t+1)\left[\Sigma_{cc}(t+1)\,M_{00}(t+1) + \Theta(t+1)\right] \\
={}& \left(1 - H(t+1)\,\bar c(t+1)\right)\bar M_{11}(t+1)
\end{aligned}
\tag{4.6.6}
$$
The perfect control

$$u(t) = -\frac{\bar a(t)}{\bar b(t)}\,\hat x(t) \tag{4.6.7}$$

drives the estimated state to zero just prior to measurement
update, i.e.,

$$\hat x(t+1|t) = \bar a(t)\,\hat x(t) - \bar b(t)\,G(t)\,\hat x(t) = 0 \tag{4.6.8}$$

and the state estimate evolves as

$$\hat x(t) = H(t)\,z(t) \tag{4.6.9}$$

since the predicted state estimate is zero.

Note that in this case, the optimal gains are independent
of the state weightings Q(t) used in the original
cost functional. Only a single-point boundary value problem
needs to be solved to compute the optimal filter gain
sequence, since the filter equations have been uncoupled from
the co-state equations P(t). Since the optimal gains are
independent of the data, they may be pre-computed off-line
given the noise statistics.
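The off-line computation mentioned above can be sketched as a single forward pass over Eqs. (4.6.4) to (4.6.6), using the filter gain form of Eq. (4.6.12). The sketch assumes stationary scalar statistics; the function name `deadbeat_filter_gains` and the numerical values are hypothetical and serve only as an illustration.

```python
def deadbeat_filter_gains(a, c, S_aa, S_cc, Xi, Theta, x0_mean, S_x0, N):
    """Forward pass for the case Sigma_bb = 0, R = 0 (G = a/b).

    Returns the filter gains H(1), ..., H(N). No co-state equations
    are needed, so the gains depend only on the noise statistics.
    """
    M00 = S_x0 + x0_mean ** 2      # mean square of the state at t = 0
    M11 = S_x0                     # error covariance at t = 0
    gains = []
    for _ in range(N):
        M_pred = a * a * M11 + S_aa * M00 + Xi              # Eq. (4.6.5)
        M00 = M_pred                                        # Eq. (4.6.4), same right-hand side
        H = M_pred * c / (c * c * M_pred + S_cc * M00 + Theta)
        M11 = (1.0 - H * c) * M_pred                        # Eq. (4.6.6)
        gains.append(H)
    return gains


# Illustrative numbers only.
H_seq = deadbeat_filter_gains(a=1.1, c=1.0, S_aa=0.1, S_cc=0.1,
                              Xi=1.0, Theta=1.0, x0_mean=0.0, S_x0=1.0, N=200)
```

Because the loop never references Q, R, or the co-states, the sketch mirrors the remark that only a single-point boundary value problem has to be solved here.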

We remark that since the control in this case may
be written as

$$u(t) = -\frac{\bar a(t)}{\bar b(t)}\,H(t)\,z(t) \tag{4.6.10}$$

it is a linear function of the measurement z(t) and H(t).

This is an example of the nonclassical information pattern,
Witsenhausen [4]. The controller is a zero-memory controller
without perfect recall.

Remark 4.3. The presence of the uncertainty $\Sigma_{aa}(t)$ and
$\Sigma_{cc}(t)$ in the parameters a(t) and c(t) multiplying the
state x(t) tends to destabilize the system. This is readily
seen from Eqs. (4.6.5) and (4.6.6) since the variance can be
destabilized by large $\Sigma_{aa}$ and high gain H(t).

$$
\begin{aligned}
\bar M_{11}(t+1) ={}& \bar a^2(t)\left(1 - H(t)\,\bar c(t)\right)^2\bar M_{11}(t) + \Sigma_{aa}(t)\,\bar M_{11}(t) + \Xi(t) \\
& + \bar a^2(t)\,H^2(t)\left[\Sigma_{cc}(t)\,\bar M_{11}(t) + \Theta(t)\right]
\end{aligned}
\tag{4.6.11}
$$

This result is very intuitive and cautions one against using
arbitrarily high gains in the closed-loop system.

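Remark 4.4. The stochastic singular control problem
($\Sigma_{bb}(t) = 0$, $R(t) = 0$) represents the formal dual to the
optimal stochastic control with perfect estimation discussed
in Chapter 2. To see this, we write for the optimal filter
gain

$$H(t) = \bar M_{11}(t)\,\bar c(t)\left[\bar c^2(t)\,\bar M_{11}(t) + \Sigma_{cc}(t)\,M_{00}(t) + \Theta(t)\right]^{-1} \tag{4.6.12}$$

where

$$\bar M_{11}(t) = \bar a^2(t-1)\,M_{11}(t-1) + \Sigma_{aa}(t-1)\,\bar M_{11}(t-1) + \Xi(t-1) \tag{4.6.13}$$

since $\bar M_{11}(t) = M_{00}(t)$.

The predicted error covariance then satisfies the
equation

$$\bar M_{11}(t+1) = \bar a^2(t)\,\bar M_{11}(t) - \bar a^2(t)\,H(t)\,\bar c(t)\,\bar M_{11}(t) + \Sigma_{aa}(t)\,\bar M_{11}(t) + \Xi(t) \tag{4.6.14}$$

Using Eq. (4.6.12),

$$\bar M_{11}(t+1) = \left(\bar a^2(t) + \Sigma_{aa}(t)\right)\bar M_{11}(t) + \Xi(t) - \frac{\bar a^2(t)\,\bar c^2(t)\,\bar M_{11}^2(t)}{\left(\bar c^2(t) + \Sigma_{cc}(t)\right)\bar M_{11}(t) + \Theta(t)} \tag{4.6.15}$$

and

$$H(t) = \frac{\bar c(t)\,\bar M_{11}(t)}{\left(\bar c^2(t) + \Sigma_{cc}(t)\right)\bar M_{11}(t) + \Theta(t)} \tag{4.6.16}$$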

The  equations  are  the  ior.nl  duals  to  the 

d (2  3 13)  tor  the  optimal  stochastic  control 
(2.3.12,  and  <3.3.  „ote  that  the  linear  feedback 

— TsU>  - the  optimal  solution  .hereas 
control  Biven  by  «•  • ' give„  by  Eqs.  (4.6.12, 

the  linear  unbiased  ^ to  the  original 

and  (4.6.13)  i duality  relationship 

— control  prob  e - ^ ^ ^ _ 

between  the  perfect  estimatio 

trol  problem  is  only  formal.  see  that  the 

Recalling the result of Chapter 3, we see that the
result for the linear unbiased minimum variance estimator
did not represent a dual to the optimal stochastic control
problem with perfect measurement considered in Chapter 2.
For the optimal linear estimation problem, it was shown that
the dual problem is a control problem with constraints on
the states. The similarity in the solutions is presented
in Sankaran and Srinath [17].
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dynamic  compensator.  The  structure  of  the  linear  estimator- 
controller is given in Fig. 4.2. We discussed in more detail
the solution to this problem, i.e., coupled Riccati-type
equations. We note from Eq. (4.5.5) that $P_{11}(t)$ is
uncoupled from the $P_{00}(t)$ equation if $\Sigma_{cc}(t) = 0$, $\forall t$, and
the measurement data is noise free. In the noisy sensor
measurement case, $P_{11}(t)$ is uncoupled from the $P_{00}(t)$ equation
if the covariances $\Sigma_{aa}(t) = \Sigma_{bb}(t) = 0$; and this is the
standard linear-quadratic-Gaussian problem. The assumption
of randomly varying parameters in the dynamic system has
coupled the "state" M and "co-state" P together. The solution
of a matrix two-point boundary value problem will yield
the optimal gains of the dynamic compensator. The optimal
controls are not given by the separation theorem.

We then considered several special cases for the
dynamic system with purely random (white) parameters. We
discussed a case of the deadbeat control problem in
discrete-time systems. The optimal control gains are independent of
Q in the cost function. They may be computed a priori given
the noise statistics. The solution is applicable to the
"stochastic" singular control problem, and only a
single-point boundary value problem needs to be solved. The
stochastic singular control problem is the dual of the control
problem with exact measurements considered in Chapter 2;
hence one can replace in the solution equations given in
Section 2.3 the symbols ($\bar a$, $\Sigma_{aa}$) by ($\bar c$, $\Sigma_{cc}$), K by $M_{11}$,
and G by H.


4.7 Optimum Stationary Linear Control


In Section 2.4, we showed that the infinite horizon
solution to the optimal control of dynamic systems with
uncertain parameters and exact measurements does not exist
if the parameter uncertainty exceeds a certain quantifiable
threshold. We call this the uncertainty threshold. For
dynamic systems with randomly varying parameters and noisy
sensor measurements, we seek the threshold parameter
associated with the infinite horizon problem.

In  this  section  we  will  investigate  the  question  of 
the  existence  of  steady  state  linear  optimal  stochastic  con- 
trols for  the  random  parameter  problem.  We  assume  that  the 
system  has  stationary  statistics  so  that  for  the  random 
parameters 


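$$E\{a(t)\} = \bar a, \qquad \mathrm{cov}\{a(t), a(\tau)\} = \Sigma_{aa}\,\delta(t,\tau) \tag{4.7.1}$$

$$E\{b(t)\} = \bar b, \qquad \mathrm{cov}\{b(t), b(\tau)\} = \Sigma_{bb}\,\delta(t,\tau) \tag{4.7.2}$$

$$E\{c(t)\} = \bar c, \qquad \mathrm{cov}\{c(t), c(\tau)\} = \Sigma_{cc}\,\delta(t,\tau) \tag{4.7.3}$$

and additive noises

$$\mathrm{cov}\{\xi(t), \xi(\tau)\} = \Xi\,\delta(t,\tau) \tag{4.7.4}$$

$$\mathrm{cov}\{\theta(t), \theta(\tau)\} = \Theta\,\delta(t,\tau) \tag{4.7.5}$$

We will examine the existence and finiteness of
steady-state control for the infinite-time stochastic control
problem by analyzing the solutions to the forward
difference equations,

$$
\begin{aligned}
M_{00}(t+1) ={}& \left(\bar a - \bar b\,G(t)\right)^2 M_{00}(t) + 2\,\bar b\,G(t)\left(\bar a - \bar b\,G(t)\right)M_{11}(t) + \Xi + \bar b^2 G^2(t)\,M_{11}(t) \\
& + \Sigma_{aa}\,M_{00}(t) + \Sigma_{bb}\,G^2(t)\left(M_{00}(t) - M_{11}(t)\right), \qquad M_{00}(0) = \Sigma_{x_0} + \bar x_0^2
\end{aligned}
\tag{4.7.6}
$$

$$\bar M_{11}(t+1) = \bar a^2 M_{11}(t) + \Sigma_{aa}\,M_{00}(t) + \Sigma_{bb}\,G^2(t)\left(M_{00}(t) - M_{11}(t)\right) + \Xi \tag{4.7.7}$$

$$H(t+1) = \bar M_{11}(t+1)\,\bar c\left[\bar c^2\,\bar M_{11}(t+1) + \Sigma_{cc}\,M_{00}(t+1) + \Theta\right]^{-1} \tag{4.7.8}$$

$$M_{11}(t+1) = \left(1 - H(t+1)\,\bar c\right)^2\bar M_{11}(t+1) + H^2(t+1)\left(\Sigma_{cc}\,M_{00}(t+1) + \Theta\right), \qquad M_{11}(0) = \Sigma_{x_0} \tag{4.7.9}$$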


and  backward  difference  equations. 


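$$
\begin{aligned}
P_{00}(t) ={}& \left(\bar a^2 + \Sigma_{aa}\right)\left(\Sigma_{cc}\,H^2(t+1)\,P_{11}(t+1) + P_{00}(t+1)\right) + Q \\
& - G^2(t)\Big[\left(\bar b^2 + \Sigma_{bb}\right)\left(\Sigma_{cc}\,H^2(t+1)\,P_{11}(t+1) + P_{00}(t+1)\right) + R + \Sigma_{bb}\left(1 - H(t+1)\,\bar c\right)^2 P_{11}(t+1)\Big] \\
& + \Sigma_{aa}\left(1 - H(t+1)\,\bar c\right)^2 P_{11}(t+1), \qquad P_{00}(N) = Q
\end{aligned}
\tag{4.7.10}
$$

$$
\begin{aligned}
P_{11}(t) ={}& \bar a^2\left(1 - H(t+1)\,\bar c\right)^2 P_{11}(t+1) \\
& + G^2(t)\Big[R + \left(\Sigma_{bb} + \bar b^2\right)\left(\Sigma_{cc}\,H^2(t+1)\,P_{11}(t+1) + P_{00}(t+1)\right) + \Sigma_{bb}\left(1 - H(t+1)\,\bar c\right)^2 P_{11}(t+1)\Big], \qquad P_{11}(N) = 0
\end{aligned}
\tag{4.7.11}
$$

where

$$G(t) = \frac{\bar a\,\bar b\left(\Sigma_{cc}\,H^2(t+1)\,P_{11}(t+1) + P_{00}(t+1)\right)}{\left(\bar b^2 + \Sigma_{bb}\right)\left(\Sigma_{cc}\,H^2(t+1)\,P_{11}(t+1) + P_{00}(t+1)\right) + R + \Sigma_{bb}\left(1 - H(t+1)\,\bar c\right)^2 P_{11}(t+1)} \tag{4.7.12}$$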

We can obtain the necessary conditions for the
existence of the steady-state solution to the difference
equations by assuming that as time extends to infinity in
both directions (that is, $t_0 \to -\infty$, $N \to +\infty$), $P_{00}$, $P_{11}$, $M_{00}$,
and $M_{11}$ are the steady-state values.

G and H can be eliminated from the $M_{00}$, $M_{11}$ and $P_{00}$,
$P_{11}$ equations to obtain a system of quadratic equations
in $M_{00}$ and $M_{11}$, and in $P_{00}$ and $P_{11}$, separately. Simultaneous
solution of two quadratic equations requires solving a
quartic equation. Hence, the algebraic solution to the
linear stationary system is intractable mathematically in
closed functional form except by numerical methods.

An alternative approach to the algebraic solution
of the quartic equation resulting from a system of quadratic
equations is the solution method of successive approximation.

In particular, we propose to solve the coupled nonlinear
difference equations using the control iteration method.

This essentially means that we start with an initial guess
of the G(t) gain sequence to be used in computing
the forward difference equations $M_{00}(t)$ and $M_{11}(t)$. The
computed H(t) sequence is stored on the forward
pass. On the backward pass the stored H(t) are used to solve
the backward difference equations $P_{00}(t)$ and $P_{11}(t)$; the
control gains G(t) are stored on the backward pass. These
forward-backward steps are iterated until the solutions
converge to within the chosen convergence criterion (0.001 in our
case) and the average cost stops changing significantly.
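The forward-backward sweeps described above can be sketched in code for the scalar stationary equations (4.7.6) to (4.7.12). This is a minimal illustration and not the program used for the reported simulations: the function name `control_iteration`, the zero initial guess for G(t), the stopping test on the largest gain change, and the terminal weight argument `F` are all assumptions made for the example.

```python
def control_iteration(a, b, c, S_aa, S_bb, S_cc, Xi, Theta, Q, R, F,
                      x0_mean, S_x0, N, tol=1e-3, max_sweeps=200):
    """Control iteration for the coupled M (forward) and P (backward) equations.

    Returns (G, H, P00_at_0, P11_at_0, converged), with G[t] for
    t = 0..N-1 and H[t] for t = 1..N (H[0] is unused).
    """
    G = [0.0] * N                  # assumed initial guess of the gain sequence
    H = [0.0] * (N + 1)
    P00 = P11 = 0.0
    for _ in range(max_sweeps):
        # Forward pass: propagate M00, M11 and store the filter gains H.
        M00, M11 = S_x0 + x0_mean ** 2, S_x0
        for t in range(N):
            spread = S_bb * G[t] ** 2 * (M00 - M11)
            M_pred = a * a * M11 + S_aa * M00 + spread + Xi           # (4.7.7)
            M00 = ((a - b * G[t]) ** 2 * (M00 - M11) + a * a * M11
                   + Xi + S_aa * M00 + spread)                        # (4.5.23)
            noise = S_cc * M00 + Theta                                # uses M00(t+1)
            H[t + 1] = M_pred * c / (c * c * M_pred + noise)          # (4.7.8)
            M11 = (1 - H[t + 1] * c) ** 2 * M_pred + H[t + 1] ** 2 * noise  # (4.7.9)
        # Backward pass: propagate P00, P11 with the stored H, store new G.
        P00, P11 = F, 0.0
        G_new = [0.0] * N
        for t in range(N - 1, -1, -1):
            h = H[t + 1]
            A = S_cc * h * h * P11 + P00
            B = (1 - h * c) ** 2 * P11
            D = (b * b + S_bb) * A + R + S_bb * B
            g = a * b * A / D                                         # (4.7.12)
            P00, P11 = ((a * a + S_aa) * A + Q - g * g * D + S_aa * B,  # (4.7.10)
                        a * a * B + g * g * D)                          # (4.7.11)
            G_new[t] = g
        change = max(abs(x - y) for x, y in zip(G_new, G))
        G = G_new
        if change < tol:
            return G, H, P00, P11, True
    return G, H, P00, P11, False
```

With all parameter covariances set to zero the backward pass no longer depends on the stored H(t), so the loop settles after two sweeps and reproduces the standard Riccati gain; with nonzero covariances, convergence of this simple fixed-point scheme is not guaranteed and the returned flag should be checked.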

The simulation results are used to guide the analysis
of the coupled nonlinear difference equations $M_{00}(t)$,
$M_{11}(t)$, $P_{00}(t)$, and $P_{11}(t)$ that have to be solved to obtain
the optimal control gains and filter gains. If the measurements
are exact and $\Sigma_{cc} = 0$, the stability results of Section 2.4
apply to the optimal stochastic control problem
since all equations reduce to the perfect measurement case.

We now give the following theorem.

Theorem 4.4. For the linear stationary system, if the
quantity

$$\bar a^2 + \Sigma_{aa} - \frac{\bar a^2\,\bar b^2}{\bar b^2 + \Sigma_{bb}} > 1 \tag{4.7.13}$$

then the Riccati-type equation $P_{00}(t)$ diverges as N becomes
$+\infty$. The resultant closed-loop control system is unstable
in the mean-square sense.

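Proof: From Eq. (4.5.5)

$$
\begin{aligned}
P_{00}(t) ={}& \left(\bar a^2 + \Sigma_{aa}\right)\left(\Sigma_{cc}\,H^2(t+1)\,P_{11}(t+1) + P_{00}(t+1)\right) + Q \\
& - \frac{\bar a^2\,\bar b^2\left(\Sigma_{cc}\,H^2(t+1)\,P_{11}(t+1) + P_{00}(t+1)\right)^2}{\left(\bar b^2 + \Sigma_{bb}\right)\left(\Sigma_{cc}\,H^2(t+1)\,P_{11}(t+1) + P_{00}(t+1)\right) + R + \Sigma_{bb}\left(1 - H(t+1)\,\bar c\right)^2 P_{11}(t+1)} \\
& + \Sigma_{aa}\left(1 - H(t+1)\,\bar c\right)^2 P_{11}(t+1)
\end{aligned}
\tag{4.7.14}
$$

Adding $\Sigma_{cc}\,H^2(t)\,P_{11}(t)$ to both sides, and defining
$\hat P = P_{00} + \Sigma_{cc}\,H^2 P_{11}$, we obtain that

$$
\begin{aligned}
\hat P(t) ={}& \left(\bar a^2 + \Sigma_{aa}\right)\hat P(t+1) + Q - \frac{\bar a^2\,\bar b^2\,\hat P^2(t+1)}{\left(\bar b^2 + \Sigma_{bb}\right)\hat P(t+1) + R + \Sigma_{bb}\left(1 - H(t+1)\,\bar c\right)^2 P_{11}(t+1)} \\
& + \Sigma_{aa}\left(1 - H(t+1)\,\bar c\right)^2 P_{11}(t+1) + \Sigma_{cc}\,H^2(t)\,P_{11}(t)
\end{aligned}
\tag{4.7.15}
$$

Since the terms involving $P_{11}$ are nonnegative, it follows that

$$\hat P(t) \ge \left(\bar a^2 + \Sigma_{aa}\right)\hat P(t+1) - \frac{\bar a^2\,\bar b^2\,\hat P^2(t+1)}{\left(\bar b^2 + \Sigma_{bb}\right)\hat P(t+1) + R} + Q \tag{4.7.16}$$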


We have proved in Section 2.4 for the perfect measurement
case that a Riccati equation of the form above has a finite
solution if and only if the means and covariances of the
random parameters satisfy the condition

$$\bar a^2 + \Sigma_{aa} - \frac{\bar a^2\,\bar b^2}{\bar b^2 + \Sigma_{bb}} < 1 \tag{4.7.17}$$

We have obtained, therefore, a sufficient condition
for the Riccati-type equation for $\hat P(t)$ to diverge for the
infinite-horizon stochastic control problem.
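The threshold quantity of Eqs. (4.7.13) and (4.7.17), and the behavior of the lower-bound recursion on the right-hand side of Eq. (4.7.16), can be checked numerically. The sketch below is only illustrative: the function names `uncertainty_measure` and `lower_bound_recursion`, the zero terminal value, and the sample parameter values are assumptions introduced for the example.

```python
def uncertainty_measure(a, b, S_aa, S_bb):
    """Left-hand side of Eqs. (4.7.13) and (4.7.17)."""
    return a * a + S_aa - (a * a * b * b) / (b * b + S_bb)


def lower_bound_recursion(a, b, S_aa, S_bb, Q, R, steps, P_terminal=0.0):
    """Iterate the right-hand side of Eq. (4.7.16) backward `steps` times."""
    P = P_terminal
    for _ in range(steps):
        P = (a * a + S_aa) * P - (a * b * P) ** 2 / ((b * b + S_bb) * P + R) + Q
    return P


# Two illustrative parameter sets for a = 1.1, b = 1.0:
m_small = uncertainty_measure(1.1, 1.0, 0.1, 0.1)   # below the threshold
m_large = uncertainty_measure(1.1, 1.0, 0.5, 1.0)   # above the threshold
P_small = lower_bound_recursion(1.1, 1.0, 0.1, 0.1, Q=1.0, R=1.0, steps=200)
P_large = lower_bound_recursion(1.1, 1.0, 0.5, 1.0, Q=1.0, R=1.0, steps=200)
```

In the first case the recursion settles to a finite value, while in the second it grows without bound as the horizon lengthens, which is the divergence mechanism the proof relies on.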


If $\hat P(t)$ diverges, we may have the case that only
$P_{00}(t)$ diverges while $P_{11}(t)$ converges. But this is not
possible from Eq. (4.7.11). We can also have the case that
$P_{00}(t)$ converges and $P_{11}(t)$ diverges. Again this is not
possible from Eq. (4.7.10). Hence we can only conclude
that both $P_{00}(t)$ and $P_{11}(t)$ diverge together.

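Remark 4.5. Consider the special case $\Sigma_{aa} = \Sigma_{bb} = 0$; then
the co-state equations simplify to

$$P_{00}(t) = \bar a^2\left(\Sigma_{cc}\,H^2(t+1)\,P_{11}(t+1) + P_{00}(t+1)\right) + Q - \frac{\bar a^2\,\bar b^2\left(\Sigma_{cc}\,H^2(t+1)\,P_{11}(t+1) + P_{00}(t+1)\right)^2}{\bar b^2\left(\Sigma_{cc}\,H^2(t+1)\,P_{11}(t+1) + P_{00}(t+1)\right) + R} \tag{4.7.18}$$

$$P_{11}(t) = \bar a^2\left(1 - H(t+1)\,\bar c\right)^2 P_{11}(t+1) + \frac{\bar a^2\,\bar b^2\left(\Sigma_{cc}\,H^2(t+1)\,P_{11}(t+1) + P_{00}(t+1)\right)^2}{\bar b^2\left(\Sigma_{cc}\,H^2(t+1)\,P_{11}(t+1) + P_{00}(t+1)\right) + R} \tag{4.7.19}$$

Note that Eq. (4.7.18) is just the standard Riccati
equation for the linear quadratic control problem; $P_{00}(t)$
does not diverge, independent of what $P_{11}(t)$ does. If $P_{11}(t)$
diverges, then $P_{00}(t)$ approaches $(\bar a^2/\bar b^2)\,R + Q$ as $P_{11}(t) \to \infty$. In
other words, the Riccati equation $P_{00}(t)$ converges for any
value of $\Sigma_{cc}$.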

If the co-state $P_{11}(t)$ diverges, then

$$P_{11}(t) = \bar a^2\left(1 - H(t+1)\,\bar c\right)^2 P_{11}(t+1) + \bar a^2\,\Sigma_{cc}\,H^2(t+1)\,P_{11}(t+1) \ge \bar a^2\,\frac{\Sigma_{cc}}{\bar c^2 + \Sigma_{cc}}\,P_{11}(t+1) \tag{4.7.20}$$

A sufficient condition is that $\bar a^2\,\Sigma_{cc}/(\bar c^2 + \Sigma_{cc}) > 1$ for $P_{11}(t)$
to diverge. In deriving the inequality above we have claimed
that the minimum variance filter gain is given by $\Sigma_{cc}/(\bar c^2 + \Sigma_{cc})$.
This can be readily deduced from the filter equations. Note
that Eq. (4.7.20) is the same condition we derived for the
linear minimum variance estimator in Eq. (3.4.12).
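The bound in Eq. (4.7.20) can be checked by minimizing the homogeneous growth factor over the filter gain. The sketch below assumes that only the homogeneous part of the $P_{11}$ recursion matters; the function names `homogeneous_factor` and `minimizing_gain` and the sample values are hypothetical.

```python
def homogeneous_factor(H, a, c, S_cc):
    """Growth factor a^2 [(1 - H c)^2 + S_cc H^2] multiplying P11(t+1)."""
    return a * a * ((1.0 - H * c) ** 2 + S_cc * H * H)


def minimizing_gain(c, S_cc):
    """Gain at which the bracket (1 - H c)^2 + S_cc H^2 is smallest."""
    return c / (c * c + S_cc)


# Illustrative values only.
a, c, S_cc = 1.1, 1.0, 1.0
H_star = minimizing_gain(c, S_cc)
f_star = homogeneous_factor(H_star, a, c, S_cc)   # equals a^2 S_cc / (c^2 + S_cc)
```

Setting the derivative of the bracket with respect to H to zero gives the gain returned by `minimizing_gain`, and at that gain the bracket equals $\Sigma_{cc}/(\bar c^2 + \Sigma_{cc})$, so no choice of H can make the growth factor smaller than the right-hand side of Eq. (4.7.20).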

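Remark 4.6. In the special case that $\Sigma_{bb} = 0$, we have that

$$P_{11}(t) = \bar a^2\left(1 - H(t+1)\,\bar c\right)^2 P_{11}(t+1) + \frac{\bar a^2\,\bar b^2\left(\Sigma_{cc}\,H^2(t+1)\,P_{11}(t+1) + P_{00}(t+1)\right)^2}{\bar b^2\left(\Sigma_{cc}\,H^2(t+1)\,P_{11}(t+1) + P_{00}(t+1)\right) + R} \tag{4.7.21}$$

If the homogeneous part of $P_{11}(t)$ diverges, then
$\bar a^2\,\Sigma_{cc}/(\bar c^2 + \Sigma_{cc}) > 1$. The co-state equation is given by

$$
\begin{aligned}
P_{00}(t) ={}& \left(\bar a^2 + \Sigma_{aa}\right)\left(\Sigma_{cc}\,H^2(t+1)\,P_{11}(t+1) + P_{00}(t+1)\right) + Q \\
& - \frac{\bar a^2\,\bar b^2\left(\Sigma_{cc}\,H^2(t+1)\,P_{11}(t+1) + P_{00}(t+1)\right)^2}{\bar b^2\left(\Sigma_{cc}\,H^2(t+1)\,P_{11}(t+1) + P_{00}(t+1)\right) + R} + \Sigma_{aa}\left(1 - H(t+1)\,\bar c\right)^2 P_{11}(t+1)
\end{aligned}
\tag{4.7.22}
$$

This is not in the form of the standard Riccati equation. The
inequality condition of Eq. (4.7.13) is still a sufficient
condition for divergence, however.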

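Remark 4.7. In the case that $\Sigma_{aa} = 0$, we have then the co-state

$$
\begin{aligned}
P_{00}(t) ={}& \bar a^2\left(\Sigma_{cc}\,H^2(t+1)\,P_{11}(t+1) + P_{00}(t+1)\right) + Q \\
& - \frac{\bar a^2\,\bar b^2\left(\Sigma_{cc}\,H^2(t+1)\,P_{11}(t+1) + P_{00}(t+1)\right)^2}{\left(\bar b^2 + \Sigma_{bb}\right)\left(\Sigma_{cc}\,H^2(t+1)\,P_{11}(t+1) + P_{00}(t+1)\right) + R + \Sigma_{bb}\left(1 - H(t+1)\,\bar c\right)^2 P_{11}(t+1)}
\end{aligned}
\tag{4.7.23}
$$

and

$$
\begin{aligned}
P_{11}(t) ={}& \bar a^2\left(1 - H(t+1)\,\bar c\right)^2 P_{11}(t+1) \\
& + \frac{\bar a^2\,\bar b^2\left(\Sigma_{cc}\,H^2(t+1)\,P_{11}(t+1) + P_{00}(t+1)\right)^2}{\left(\bar b^2 + \Sigma_{bb}\right)\left(\Sigma_{cc}\,H^2(t+1)\,P_{11}(t+1) + P_{00}(t+1)\right) + R + \Sigma_{bb}\left(1 - H(t+1)\,\bar c\right)^2 P_{11}(t+1)}
\end{aligned}
\tag{4.7.24}
$$

The sufficient condition for divergence as given by the
inequality Eq. (4.7.13) holds in this case ($\Sigma_{aa} = 0$).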

Remark 4.8. Lacking an analytical result on the
asymptotic stability of the closed-loop stochastic control
system, we turned to simulations to guide the analysis.
Solutions to the state and co-state equations were obtained
by the method of successive approximation. Solution values
for $P_{00}(t)$, $P_{11}(t)$, $M_{00}(t)$, and $M_{11}(t)$ are recorded to
determine the limiting solution value in case they converge.

For  a particular  system  (£  =1.0,  c = 1.0,  a = l.l,  b = 1.0), 

Fig.  4.3  gives  the  stability  and  instability  regions  for  the 
random  parameter  system.  We  see  that  for  certain  combinations 
( F , I..  ) the  steady-state  solution  to  Pnn(t)  and  P11(t)  does 

3.3.  D D UU  11 

not  exist  because  the  uncertainties  are  larger  than  some 
threshold  for  the  closed-loop  system. 

If we draw in the curve for

$$\bar a^2 + \Sigma_{aa} - \frac{\bar a^2 \bar b^2}{\bar b^2 + \Sigma_{bb}} = 1 \tag{4.7.25}$$

it will be much above the computed stability curve in Fig. 4.3
since it is only a sufficient condition.


Figure 4.3 Computed stability region for system given by equations (4.

Now if we draw in the curve (see Fig. 4.4)

$$\tilde m = \bar a^2 + \Sigma_{aa} + \left(\frac{\Sigma_{cc}}{\bar c^2 + \Sigma_{cc}} - 1\right) \frac{\bar a^2 \bar b^2}{\Sigma_{bb} + \bar b^2} = 1 \tag{4.7.26}$$


it will be somewhat below the computed stability curve in
Fig. 4.3, so that if $\tilde m < 1$ is satisfied, then the closed-loop
system is asymptotically stable. We conjecture, for now,
that this is a sufficient condition for the existence of a
steady-state solution. (This is the output feedback stability
analysis result obtained in the next section.) The modification
in Eq. (4.7.26) is motivated by the appearance of $(1 - H\bar c)$
in the P equations. Since the expression actually occurs
squared, we then revised the conjecture to be (see Fig. 4.5)

$$\bar a^2 + \Sigma_{aa} + \left(\left(\frac{\Sigma_{cc}}{\bar c^2 + \Sigma_{cc}}\right)^2 - 1\right) \frac{\bar a^2 \bar b^2}{\Sigma_{bb} + \bar b^2} = 1 \tag{4.7.27}$$

and this is a tighter upper bound curve on the stability
region for this special set of parameter uncertainties.

The behavior of a stable closed-loop system in the
mean-square sense is given in Fig. 4.6. We note that the
steady-state region is the interval where all the "co-states"
$P_{00}(t)$ and $P_{11}(t)$ and "states" $M_{00}(t)$ and $M_{11}(t)$ are at a
constant value. In this interval, the controller has constant
gains and the filter has constant gains, Fig. 4.7.
Note that there is some transient behavior, or endpoint
effects, associated with the numerical solutions.

Figure 4.4 Lower bound on the stability region defined
by equation (4.7.26) for system given by
equations (4.4.1) and (4.4.14)

Figure 4.6 Behavior of the states and costates given
by equations (4.7.6) to (4.7.12)

In Fig. 4.8 we show what happens to the solution
values of $P_{00}(t)$, $P_{11}(t)$, $M_{00}(t)$, and $M_{11}(t)$ in an unstable
closed-loop system. The solution values for all four variables
increase monotonically and, for all practical purposes,
diverge.

Remark 4.9. The effect of uncertainty in the parameter c is
investigated in Fig. 4.9. For $\Sigma_{aa} = \Sigma_{bb} = 0$, the covariance
of c contributes to the destabilization of the closed-loop
system even when the parameters a and b are known with
certainty. In the case illustrated, $\bar a = 1.1$, $\bar b = 1.0$, and
$\bar c = 1.0$, the co-state
P^Ct)  becomes  exponentially  large  when  Ecc  exceeds  the  value 


In Fig. 4.10 we show the effect of $\Sigma_{cc} > 0$, $\Sigma_{aa} = 0$
on the uncertainty threshold developed in Chapter 2,

$$m = \bar a^2 + \Sigma_{aa} - \frac{\bar a^2 \bar b^2}{\bar b^2 + \Sigma_{bb}} \tag{4.7.28}$$


It is intuitively obvious that the effective threshold is
higher, that is, there is less tolerance for the uncertainty
in the parameter b in order for the closed-loop system to
be asymptotically stable. We show similarly in Fig. 4.11 that,
for $\Sigma_{bb} = 0$, the level of uncertainty $\Sigma_{aa}$ the closed-loop
system will tolerate is smaller than in the perfect observation
case. Figures 4.10 and 4.11 can be compared with
Figs. 2.2 and 2.3.

The larger the covariance of b, ceteris paribus,
the smaller the magnitude of the control gain and the larger
the filter gain in general. The controller is exercising
caution in control, since the input is being applied with
larger uncertainty about the mean. The multiplicative noise
on the input adds to the total disturbance in the system
dynamics equation.

Figure 4.8 Behavior of the divergent states and
costates given by equations (4.7.6) to
(4.7.12)

Figure 4.10 Solution of the costate equation
(4.7.23) for known $a(t) = \bar a = 1.1$

The larger the covariance of a, ceteris paribus,
the larger the magnitude of the control gain. This is
intuitively obvious since the control wants to exercise
more probing to reduce the uncertainty in the state. The
filter gain, ceteris paribus, is also larger for larger $\Sigma_{aa}$.
The multiplicative noise on the state effectively increases
the plant noise in the estimation problem. This says that
the correction from the measurement update will be larger.

The larger the covariance of c, ceteris paribus,
the smaller the filter gain. The random parameter c multiplying
the state effectively increases the additive measurement
noise $\Theta$. The control gain is, however, larger in magnitude
as the adaptive control will use the input u(t) to reduce
the uncertainty in the state.

As we can readily see from the numerical simulation,
the random parameter stochastic control system behaves
as a non-learning adaptive control system. All future
measurements are available for the stochastic control and
estimation. The control law appropriately regulates the system
over the time horizon to minimize the average of the deviation
of the state from zero and the control effort, and this
control involves no parameter identification.

The control input u(t) affects the estimation
process and the estimation performance affects the amount
of control action necessary to regulate the system. The
system is not neutral. Caution and probing are an important
functional part of the controller. The control gains are
modulated by the covariances, which are in turn affected
by the control action.

The  value  of  information  for  the  stochastic  control 
problem  in  general  is  defined  as  the  difference  between  the 
expected  cost  J^,  the  best  we  can  do  with  the  information 
and  Jg , the  best  the  controller  can  do  without  the  informa- 
tion. This  value  of  information  provides  a measure  of  how 
the  performance  of  a random  parameter  system  is  degraded 
when  we  assume  that  nature  specifies  the  system  parameters 
at  all  times. 

To obtain a comparison of the cost among the several
control schemes (the constrained controller-estimator of
Section 4.4, the certainty-equivalent controller, the
enforced separation controller, and the Kalman filter-perfect
estimation controller), one could proceed with a Monte Carlo
simulation of the closed-loop system.

Remark 4.10. For stable systems where $|\bar a| < 1$, it is observed
from the simulation results that if

$$(\bar a^2 + \Sigma_{aa}) < 1 \tag{4.7.29}$$

the closed-loop system converges for any values of means
and covariances.

If $(\bar a^2 + \Sigma_{aa}) > 1$, then the solution values of $P_{00}(t)$
and $P_{11}(t)$ diverge for certain combinations of the means
and variances of the parameters. In general, the sufficient
condition Eq. (4.7.15) holds for the originally stable as well
as unstable systems.


If $\Sigma_{aa} = \Sigma_{bb} = 0$, then there is no possibility of
divergence since the stability region is above the curve

$$\frac{\Sigma_{cc}}{\bar c^2 + \Sigma_{cc}}\, \bar a^2 = 1 \tag{4.7.30}$$


For the stationary system, we consider the perfect
control problem presented in Section 4.6. The existence of
a solution to the stochastic singular control problem depends
on the existence of a positive-definite solution of the
algebraic Riccati-type equation

$$M_{11}(t+1) = (\bar a^2 + \Sigma_{aa}) M_{11}(t) + \Xi - \frac{\bar a^2 \bar c^2 M_{11}^2(t)}{(\bar c^2 + \Sigma_{cc}) M_{11}(t) + \Theta} \tag{4.7.31}$$

The critical points of this type of algebraic equation were
discussed in Section 2.4. By identifying $M_{11}$ with K in
Eq. (2.4.1) the following result can be stated.

Theorem 4.5

If the means and covariances of the random parameters are
such that

$$\bar a^2 + \Sigma_{aa} - \frac{\bar a^2 \bar c^2}{\bar c^2 + \Sigma_{cc}} < 1 \tag{4.7.32}$$

then a non-negative definite solution of Eq. (4.7.31) exists.

Proof: The proof is similar to that given in Section 2.4.

This inequality condition can be analyzed in the
same manner as for the perfect estimation case. The stochastic
singular control system is stable if and only if the
inequality in Eq. (4.7.32) is satisfied.

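The covariance equation may be written as

$$M_{11}(t+1) = \bar a^2 (1 - H(t)\bar c)^2 M_{11}(t) + \Sigma_{aa} M_{11}(t) + \Xi + \bar a^2 H^2(t) \left[\Sigma_{cc} M_{11}(t) + \Theta\right] \tag{4.7.33}$$

where $[(1 - H(t)\bar c)\bar a]$ is the closed-loop system parameter
and

$$H(t) = M_{11}(t)\, \bar c \left[(\bar c^2 + \Sigma_{cc}) M_{11}(t) + \Theta(t)\right]^{-1} \tag{4.7.34}$$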

When $\Sigma_{aa} = \Sigma_{cc} = 0$, the sufficient condition for stability is
that $M_{11}(t)$ be stable. It is well known that in general if
the system is observable, then the propagation of the
covariances will converge to some steady-state value; and this
is true for a scalar system.

When $\Sigma_{aa} \neq 0$ and $\Sigma_{cc} \neq 0$, the stability of the
covariance equation depends on the level of uncertainty in the
parameters a(t) and c(t). Note that both uncertainties
destabilize the covariance propagation equation. The
destabilizing effect due to $\Sigma_{cc}$ will be greater since it is
multiplied by the square of the filter gain.
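
The behavior described above can be explored numerically. The
following is a minimal sketch in Python, assuming the scalar
recursion of Eqs. (4.7.33) and (4.7.34) with time-invariant
noise intensities. The function names, the values
$\Xi = \Theta = 1$, and the two parameter sets are illustrative
assumptions and are not the cases simulated in this chapter.

```python
def filter_gain(M, c_bar, S_cc, theta):
    """Filter gain H(t) of Eq. (4.7.34) for the scalar covariance M = M11(t)."""
    return M * c_bar / ((c_bar**2 + S_cc) * M + theta)


def covariance_step(M, a_bar, c_bar, S_aa, S_cc, xi, theta):
    """One step of the covariance propagation, Eq. (4.7.33)."""
    H = filter_gain(M, c_bar, S_cc, theta)
    return (a_bar**2 * (1.0 - H * c_bar)**2 * M
            + S_aa * M + xi
            + a_bar**2 * H**2 * (S_cc * M + theta))


def threshold(a_bar, c_bar, S_aa, S_cc):
    """Left-hand side of the inequality in Eq. (4.7.32)."""
    return a_bar**2 + S_aa - a_bar**2 * c_bar**2 / (c_bar**2 + S_cc)


def propagate(steps, **p):
    """Iterate Eq. (4.7.33) from M11(0) = 0 and return the whole trajectory."""
    M, path = 0.0, [0.0]
    for _ in range(steps):
        M = covariance_step(M, **p)
        path.append(M)
    return path


# Hypothetical parameter sets (a_bar = 1.1, c_bar = 1.0 as in the example system).
stable = dict(a_bar=1.1, c_bar=1.0, S_aa=0.0, S_cc=0.5, xi=1.0, theta=1.0)
unstable = dict(a_bar=1.1, c_bar=1.0, S_aa=0.5, S_cc=2.0, xi=1.0, theta=1.0)

for name, p in (("stable", stable), ("unstable", unstable)):
    m = threshold(p["a_bar"], p["c_bar"], p["S_aa"], p["S_cc"])
    path = propagate(100, **p)
    print(f"{name}: threshold = {m:.4f}, M11(100) = {path[-1]:.4e}")
```

In the first set the threshold expression is below one and the
iterates settle to a constant; in the second it exceeds one and
the iterates grow without bound, in line with Theorem 4.5.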

Conclusions 

In this subsection we summarize the key results
obtained in Section 4.7. We are interested in seeking a
threshold condition for the infinite horizon problem and,
hence, for the existence of a steady-state control law. Since the
coupled nonlinear Riccati-type matrix difference equation is
too complex to solve analytically, we used the
control iteration method to simulate the system of equations
in the two-point boundary value problem. We were able to
immediately obtain a sufficient condition for the solution to the
coupled Riccati-type equations to diverge for the
infinite-horizon problem.

Next we proceeded to investigate some special cases:
1) $\Sigma_{aa} = \Sigma_{bb} = 0$, the Riccati-like equation for $P_{00}(t)$ always
has a limiting solution; 2) $\Sigma_{aa} \neq 0$, $\Sigma_{bb} = 0$, $P_{00}(t)$ may diverge
as $N \to \infty$; and 3) $\Sigma_{aa} = 0$, $\Sigma_{bb} \neq 0$, $P_{00}(t)$ may diverge as
$N \to \infty$. The computed (simulated) stability region curve is then
presented in Fig. 4.3. Some conjectures on the sufficient
conditions for mean-square stability are given in Figs. 4.4
and 4.5. The uncertainties in the random parameters have a
destabilizing effect on the dynamic system, in moving the
effective poles outside the unit disk. This is argued as
follows. The uncertainty in a increases the magnitude of
the control gain. The uncertainty in b increases the magnitude
of the filter gains. The uncertainty in c reduces the
filter gains, but it increases the control gains since the
variance $\Sigma_{cc} \neq 0$ is effectively additional control weight
in the co-state equations.
If the random system is originally stable, we can
say something more about the mean-square stability of the
linear control system. The feedback system is stable if
$(\bar a^2 + \Sigma_{aa}) < 1$. If $\Sigma_{aa} = \Sigma_{bb} = 0$, then the fixed structure
control system is always stable.

For the stochastic singular control system, we obtained
the sufficient condition for mean-square stability
under feedback, which is the dual to the case with exact
measurements ($\Sigma_{cc} = 0$, $\Theta = 0$). If this threshold condition
is violated, then the optimal solution to the infinite horizon
problem does not exist.


4.8 Stability of Stochastic Dynamical Systems

In this section we proceed by analogy with the
method of analysis in Section 2.5 and derive the conditions
for the asymptotic stability of the closed-loop system. In
particular, we shall deal with the stochastic difference
equation

$$x(t+1) = a(t)\,x(t) + b(t)\,u(t) \tag{4.8.1}$$

with the linear output feedback law

$$u(t) = g(t)\,y(t) \tag{4.8.2}$$

and output

$$y(t) = c(t)\,x(t) \tag{4.8.3}$$

then

$$x(t+1) = [a(t) + b(t)\,g(t)\,c(t)]\,x(t) = \phi(t)\,x(t) \tag{4.8.4}$$

The propagation of the second moment of x is given by

$$\begin{aligned}
E\{x^2(t+1)\} &= E\{a^2(t) + b^2(t)g^2(t)c^2(t) + 2a(t)b(t)g(t)c(t)\}\, E\{x^2(t)\} \\
&= \left[E\{a^2(t)\} + g^2(t)\,E\{b^2(t)c^2(t)\} + 2g(t)\,E\{a(t)b(t)c(t)\}\right] E\{x^2(t)\}
\end{aligned} \tag{4.8.5}$$

$$\frac{E\{x^2(t+1)\}}{E\{x^2(1)\}} = E\{\phi^2(1)\}\,E\{\phi^2(2)\} \cdots E\{\phi^2(t)\} = S(t) \tag{4.8.6}$$

The minimum of S(t) is obtained if each term is
minimized for all t. Thus, carrying out the algebraic
minimization, we get

$$g^*(t) = -\frac{E\{a(t)\,b(t)\,c(t)\}}{E\{b^2(t)\,c^2(t)\}} \tag{4.8.7}$$

Substituting this result into Eq. (4.8.6), we get that the
minimum value of S(t) is

$$S(t) = E\{a^2(t)\} - \frac{\left(E\{a(t)\,b(t)\,c(t)\}\right)^2}{E\{b^2(t)\,c^2(t)\}} \tag{4.8.8}$$

In the case where the system parameters a(t) and b(t)
are uncorrelated with the measurement parameter c(t), as has
been assumed in Section 4.3, we then have

$$g^*(t) = -\frac{E\{a(t)\,b(t)\}\; \bar c(t)}{E\{b^2(t)\}\, E\{c^2(t)\}} \tag{4.8.9}$$

The minimum value of S(t) is then, assuming the random
parameters are wide-sense stationary,

$$S(t) = \bar a^2 + \Sigma_{aa} - \frac{(\bar a \bar b + \Sigma_{ab})^2\, \bar c^2}{(\Sigma_{bb} + \bar b^2)(\Sigma_{cc} + \bar c^2)} \triangleq \tilde m \tag{4.8.10}$$

The variance of x(t) is bounded if and only if

$$\tilde m < 1 \tag{4.8.11}$$
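
As a numerical check on Eqs. (4.8.7) to (4.8.10), the sketch
below evaluates the output feedback gain and the resulting
$\tilde m$ from the means and covariances, and compares the closed
form with a brute-force scan over g. It assumes scalar,
wide-sense stationary parameters with c uncorrelated with a
and b; the function names and the numerical values are
illustrative assumptions only.

```python
def second_moment_factor(g, a_bar, b_bar, c_bar, S_aa, S_bb, S_cc, S_ab=0.0):
    """E{phi^2} from Eq. (4.8.5), with c uncorrelated with a and b."""
    Ea2 = a_bar**2 + S_aa
    Eb2 = b_bar**2 + S_bb
    Ec2 = c_bar**2 + S_cc
    Eab = a_bar * b_bar + S_ab
    return Ea2 + g**2 * Eb2 * Ec2 + 2.0 * g * Eab * c_bar


def output_feedback_gain(a_bar, b_bar, c_bar, S_bb, S_cc, S_ab=0.0):
    """Minimizing gain of Eq. (4.8.9)."""
    return -(a_bar * b_bar + S_ab) * c_bar / ((b_bar**2 + S_bb) * (c_bar**2 + S_cc))


def m_tilde(a_bar, b_bar, c_bar, S_aa, S_bb, S_cc, S_ab=0.0):
    """Minimum value of E{phi^2}, Eq. (4.8.10)."""
    num = (a_bar * b_bar + S_ab)**2 * c_bar**2
    den = (S_bb + b_bar**2) * (S_cc + c_bar**2)
    return a_bar**2 + S_aa - num / den


# Hypothetical means and variances chosen only for illustration.
params = dict(a_bar=1.1, b_bar=1.0, c_bar=1.0, S_aa=0.1, S_bb=0.2, S_cc=0.3)
g_star = output_feedback_gain(1.1, 1.0, 1.0, S_bb=0.2, S_cc=0.3)
m_closed = m_tilde(**params)

# Brute-force scan over g on a fine grid as an independent check.
grid = [-2.0 + 0.001 * k for k in range(4001)]
g_scan = min(grid, key=lambda g: second_moment_factor(g, **params))

print(f"g* = {g_star:.4f}, scanned minimizer = {g_scan:.4f}")
print(f"m_tilde = {m_closed:.4f}, mean-square stable: {m_closed < 1.0}")
```

The scan is only a sanity check; because $E\{\phi^2\}$ is quadratic
in g, the closed-form gain is the unique minimizer.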


If we rewrite this result as

$$\tilde m = \bar a^2 + \Sigma_{aa} + \left(\frac{\Sigma_{cc}}{\Sigma_{cc} + \bar c^2} - 1\right) \frac{(\Sigma_{ab} + \bar a \bar b)^2}{\Sigma_{bb} + \bar b^2} \tag{4.8.12}$$

we find that this threshold $\tilde m$ differs from the m in Eq. (2.4.3)
in the expression $((\Sigma_{cc}/(\bar c^2 + \Sigma_{cc})) - 1)$. We note that if
$\Sigma_{cc} = 0$, then $\tilde m$ reduces to m. Effectively, driving the system in
Eq. (4.8.1) using direct output feedback represents a
worst-case analysis.
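
The relation between $\tilde m$ and the exact-measurement threshold
can be illustrated with a short computation. The sketch below
assumes $\Sigma_{ab} = 0$ and uses the form of m quoted in Eq. (4.7.28);
`sigma_bb_boundary` is a hypothetical helper obtained by solving
$\tilde m = 1$ in Eq. (4.8.12) for $\Sigma_{bb}$, and the numerical values
are illustrative.

```python
def m_exact(a_bar, b_bar, S_aa, S_bb):
    """Exact-measurement threshold m in the form of Eq. (4.7.28) (S_ab = 0)."""
    return a_bar**2 + S_aa - a_bar**2 * b_bar**2 / (b_bar**2 + S_bb)


def m_tilde(a_bar, b_bar, c_bar, S_aa, S_bb, S_cc):
    """Output feedback threshold of Eq. (4.8.12) with S_ab = 0."""
    factor = S_cc / (S_cc + c_bar**2) - 1.0
    return a_bar**2 + S_aa + factor * a_bar**2 * b_bar**2 / (S_bb + b_bar**2)


def sigma_bb_boundary(a_bar, b_bar, c_bar, S_aa, S_cc):
    """S_bb at which m_tilde = 1.

    Returns None when a_bar^2 + S_aa <= 1, where m_tilde < 1 for every
    S_bb >= 0 (nonzero means assumed). A negative return value means
    that no S_bb >= 0 satisfies m_tilde < 1.
    """
    excess = a_bar**2 + S_aa - 1.0
    if excess <= 0.0:
        return None
    ratio = c_bar**2 / (c_bar**2 + S_cc)
    return ratio * a_bar**2 * b_bar**2 / excess - b_bar**2


a_bar, b_bar, c_bar = 1.1, 1.0, 1.0
for S_cc in (0.0, 0.5, 1.0):
    S_bb = sigma_bb_boundary(a_bar, b_bar, c_bar, S_aa=0.0, S_cc=S_cc)
    print(f"S_cc = {S_cc}: m_tilde = 1 at S_bb = {S_bb:.4f}")
```

As $\Sigma_{cc}$ grows, the boundary value of $\Sigma_{bb}$ shrinks, which is
the sense in which measurement parameter uncertainty reduces
the tolerable input parameter uncertainty.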

In other words, we can improve on this sufficient
condition for mean-square stability by using any reasonable
control law. This is verified when we use the linear unbiased
estimator of a fixed structure given by Eqs. (4.5.2) to (4.4.4).
In principle, we have then derived the lower bound on the
actual stability curve for the closed-loop system given by
Eq. (4.8.12).

We remark that from Eq. (4.8.12) if $\bar a^2 + \Sigma_{aa} < 1$ and
$\Sigma_{cc} \neq 0$, then the stable system (4.8.1) is again stabilizable
under feedback. Mathematically, this says that any combination
of means and covariances that satisfies inequality
(4.8.12) also satisfies the true threshold condition. The
converse is not true. The inequality condition in Eq. (4.8.12)
is only a sufficient condition. This is illustrated in
Fig. 4.4. Superimposing Fig. 4.4 on Fig. 4.3 would show
that the stability region curve given by Eq. (4.8.12) is
below the computed mean-square stability region curve in
Fig. 4.3, which was obtained from simulations.

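Consider now the case $\Sigma_{bb} = 0$; then the threshold
$\tilde m$ becomes

$$\tilde m = \Sigma_{aa} + \frac{\Sigma_{cc}}{\Sigma_{cc} + \bar c^2}\, \bar a^2 = \bar a^2 + \Sigma_{aa} - \frac{\bar a^2 \bar c^2}{\Sigma_{cc} + \bar c^2} \tag{4.8.13}$$

and from Eq. (4.6.9)

$$g^* = -\frac{\bar a\, \bar c}{\bar b\,(\Sigma_{cc} + \bar c^2)} \tag{4.8.14}$$

If $\Sigma_{cc} = 0$, we have the stability condition $\Sigma_{aa} < 1$.
In the case $\Sigma_{cc} \neq 0$, if $\bar a^2 + \Sigma_{aa} < 1$, then the system is
stabilizable under linear feedback for all levels of parameter
uncertainty.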

We have stated that the inequality condition in
Eq. (4.8.11) is only a sufficient condition for mean-square
stability; the gain in Eq. (4.8.14) does not correspond to
the limiting control gain obtained from the TPBVP, i.e.,

$$\lim G(t) = -\frac{\bar a}{\bar b} \tag{4.8.15}$$

which is independent of $\Sigma_{cc}$ and $\bar c$. This is obvious since
the $g^*$ here is based on output feedback. So that when
$\Sigma_{cc} = 0$,

$$g^* = -\frac{\bar a}{\bar b\, \bar c} \tag{4.8.16}$$

If in addition $\Sigma_{aa} = 0$, then

$$\tilde m = \frac{\Sigma_{cc}}{\bar c^2 + \Sigma_{cc}}\, \bar a^2 \tag{4.8.17}$$

Hence, the closed-loop system is mean-square stable for all
$|\bar a| < 1$. In the perfect estimation problem $\tilde m = 0$, of course.

If $\Sigma_{aa} = 0$, but $\Sigma_{bb} > 0$, then the sufficient condition
for mean-square stability becomes

$$\tilde m = \bar a^2 + \left(\frac{\Sigma_{cc}}{\Sigma_{cc} + \bar c^2} - 1\right) \frac{\bar a^2 \bar b^2}{\Sigma_{bb} + \bar b^2} \tag{4.8.18}$$

and

$$g^* = -\frac{\bar a\, \bar b\, \bar c}{(\Sigma_{bb} + \bar b^2)(\Sigma_{cc} + \bar c^2)} \tag{4.8.19}$$

For $|\bar a| < 1$, the system is stabilizable under linear feedback.

We conclude that the really interesting cases to
study are systems with $(\bar a^2 + \Sigma_{aa}) > 1$ and $\tilde m < 1$. The
destabilizing effect of the variance of c(t) is manifested in this
range of values.

Stochastic  Stability  Using  Fixed  Structure  Controller 

The next problem to examine at this point is whether, if the
parameter uncertainties are such that

$$\bar a^2 + \Sigma_{aa} - \frac{(\bar a \bar b + \Sigma_{ab})^2\, \bar c^2}{(\Sigma_{bb} + \bar b^2)(\Sigma_{cc} + \bar c^2)} > 1 \tag{4.8.20}$$

the stochastic system with random parameters can still be
mean-square stabilizable under linear feedback ($u(t) = g\,\hat x(t)$).
We will attempt to formulate this problem in the
subsequent analysis. We propose to use a linear unbiased
estimator for the state in the closed-loop controller, i.e.,

$$\hat x(t) = (1 - h \bar c)\, \hat x(t-1)\,(\bar a + \bar b g) + h\, y(t) \tag{4.8.21}$$

where

$$y(t) = c(t)\, x(t) \tag{4.8.22}$$

and the closed-loop system is

$$x(t+1) = a(t)\, x(t) + b(t)\, g(t)\, \hat x(t) \tag{4.8.23}$$

The naive estimate is of the form

$$\hat x(t) = \frac{1}{\bar c}\, y(t) = \frac{c(t)}{\bar c}\, x(t) \tag{4.8.24}$$

Therefore,

$$u(t) = g(t)\, \frac{c(t)}{\bar c}\, x(t) \tag{4.8.25}$$

Substituting this into Eq. (4.8.1) we get that

$$x(t+1) = \left[a(t) + b(t)\, g(t)\, \frac{c(t)}{\bar c}\right] x(t) \tag{4.8.26}$$

Minimizing the variance of x(t), we obtain that

$$\frac{\partial}{\partial g}\, E\left\{a^2(t) + b^2(t)\, g^2(t)\, \frac{c^2(t)}{\bar c^2} + 2a(t)\, b(t)\, g(t)\, \frac{c(t)}{\bar c}\right\} = 0 \tag{4.8.27}$$

The resulting control law is given by

$$u(t) = -\frac{E\{ab\}\; \bar c^2}{E\{b^2\}\, E\{c^2\}}\; \hat x(t) \tag{4.8.28}$$

We note that this is the same control law as using
the direct output feedback. Hence, all the previous results
follow (Eq. (4.8.11)). It is obvious from Eq. (4.8.21) that if

$$h = \frac{1}{\bar c} \tag{4.8.29}$$

then Eq. (4.8.24) follows. Therefore, the output feedback
control is equivalent (identical) to choosing $h = 1/\bar c$.

In the linear-quadratic-Gaussian problem we are able
to examine the necessary and sufficient conditions for the
existence of stabilizing gains. In the time-invariant case,
the characteristic values of the closed-loop system comprise
the characteristic values of [a - bg] (the regulator poles)
and the characteristic values of [a - hc] (the estimator poles).
Overall system stability then requires the poles to be inside
the unit circle. For the random parameter system, the cascaded
system poles do not consist of those of the deterministic
optimal control problem and those of the optimal
estimation problem, since the Separation Principle no longer
holds. Hence, we need a separate analysis and a measure
of stochastic stability.

We want to analyze the stability of the fixed structure
control system and obtain a tighter lower bound on the
stability region curve. We have that

$$x(t+1) = a(t)\, x(t) + b(t)\, g(t)\, \hat x(t) \tag{4.8.30}$$

where

$$\hat x(t) = (1 - h(t)\bar c)\, \hat x(t|t-1) + h(t)\, c(t)\, x(t) \tag{4.8.31}$$

From Eq. (4.8.30) we have that

$$\hat x(t+1|t) = \bar a\, \hat x(t) + \bar b\, g(t)\, \hat x(t) = (\bar a + \bar b g)\, \hat x(t) \tag{4.8.32}$$

We can then write for the closed-loop control system
a second-order difference equation in
$[x(t+1),\ \hat x(t+1|t)] = \mathbf{x}(t+1)$.
We remark that this "state" representation is equivalent to a
$[\hat x(t+1|t),\ e(t+1|t)]$ representation if we define

$$e(t+1|t) = \hat x(t+1|t) - x(t+1) \tag{4.8.33}$$


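Then we have

$$\begin{bmatrix} x(t+1) \\ \hat x(t+1|t) \end{bmatrix} =
\begin{bmatrix}
a(t) + b(t)\,g(t)\,h(t)\,c(t) & b(t)\,g(t)\,(1 - h(t)\bar c) \\
(\bar a + \bar b g(t))\,h(t)\,c(t) & (\bar a + \bar b g(t))\,(1 - h(t)\bar c)
\end{bmatrix}
\begin{bmatrix} x(t) \\ \hat x(t|t-1) \end{bmatrix} \tag{4.8.34}$$

We write the above as

$$\mathbf{x}(t+1) \triangleq A(t)\, \mathbf{x}(t) \tag{4.8.35}$$

We analyze the mean-square stability of such a second-order
system by examining the Lyapunov function

$$V(\mathbf{x}(t)) = \mathbf{x}'(t)\, \mathbf{x}(t) \tag{4.8.36}$$

Then we compute, for a given $\mathbf{x}(t)$,

$$E\{V(\mathbf{x}(t+1)) - V(\mathbf{x}(t))\} = \mathbf{x}'(t)\left(E\{A'(t)\, A(t)\} - I\right)\mathbf{x}(t) \tag{4.8.37}$$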
Theorem 4.6. The solution of the system given by the second-order
difference equation is mean-square stable if and only if

$$E\{A'(t)\, A(t)\} - I < 0 \tag{4.8.38}$$

or, equivalently, the maximum eigenvalue of the matrix $E\{A'(t)\, A(t)\}$
has to be less than unity in magnitude, i.e., $\max(|\lambda_1|, |\lambda_2|) < 1$.

Proof: See [58].

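Applying this fact to our system, then

$$E\{A'(t)A(t)\} =
\begin{bmatrix}
E\{a^2\} + 2E\{ab\}\,g h \bar c + E\{b^2\}\,g^2 h^2 E\{c^2\} + (\bar a + \bar b g)^2 h^2 E\{c^2\}
& g(1 - h\bar c)\left(E\{ab\} + E\{b^2\}\,g h \bar c\right) + (\bar a + \bar b g)^2 (1 - h\bar c)\, h \bar c \\
\left(E\{ab\} + E\{b^2\}\,g h \bar c\right) g (1 - h\bar c) + (\bar a + \bar b g)^2\, h \bar c\, (1 - h\bar c)
& E\{b^2\}\,g^2 (1 - h\bar c)^2 + (\bar a + \bar b g)^2 (1 - h\bar c)^2
\end{bmatrix} \tag{4.8.39}$$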

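The eigenvalues of this symmetric matrix are obtained
by solving $\det(E\{A'A\} - \lambda I) = 0$. We are free to choose g and h.

After some algebraic manipulations, we obtain that the roots
of the characteristic equation are given by

$$\lambda_{1,2} = \frac{\beta \pm \sqrt{\beta^2 - 4\alpha}}{2} \tag{4.8.40}$$

if we define

$$\beta = E\{a^2\} + 2E\{ab\}\,g h \bar c + E\{b^2\}\,g^2 h^2 E\{c^2\} + (\bar a + \bar b g)^2 h^2 E\{c^2\} + (1 - h\bar c)^2 \left(E\{b^2\}\,g^2 + (\bar a + \bar b g)^2\right) \tag{4.8.41}$$

and

$$\begin{aligned}
\alpha = {} & \left[E\{a^2\} + 2E\{ab\}\,g h \bar c + E\{b^2\}\,g^2 h^2 E\{c^2\} + (\bar a + \bar b g)^2 h^2 E\{c^2\}\right] \left[E\{b^2\}\,g^2 + (\bar a + \bar b g)^2\right] (1 - h\bar c)^2 \\
& - \left[\left(E\{ab\} + E\{b^2\}\,g h \bar c\right) g + (\bar a + \bar b g)^2\, h \bar c\right]^2 (1 - h\bar c)^2
\end{aligned} \tag{4.8.42}$$

Thus,

$$\begin{aligned}
\beta^2 - 4\alpha = {} & \left\{E\{a^2\} + 2E\{ab\}\,g h \bar c + E\{b^2\}\,g^2 h^2 E\{c^2\} + (\bar a + \bar b g)^2 h^2 E\{c^2\} - \left[E\{b^2\}\,g^2 + (\bar a + \bar b g)^2\right] (1 - h\bar c)^2\right\}^2 \\
& + 4\left[\left(E\{ab\} + E\{b^2\}\,g h \bar c\right) g + (\bar a + \bar b g)^2\, h \bar c\right]^2 (1 - h\bar c)^2
\end{aligned} \tag{4.8.43}$$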
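
The eigenvalue test of Theorem 4.6 is straightforward to
evaluate numerically for given gains. The following sketch
assumes scalar parameters with c independent of a and b, builds
the entries of $E\{A'A\}$ from Eq. (4.8.39), and applies
Eq. (4.8.40) with $\beta$ the trace and $\alpha$ the determinant. The
function names, the statistics, and the trial gains are
illustrative assumptions; the gains are not optimized.

```python
import math


def expected_AtA(g, h, a_bar, b_bar, c_bar, S_aa, S_bb, S_cc, S_ab=0.0):
    """Entries (p, q, r) of the symmetric matrix E{A'A} = [[p, q], [q, r]], Eq. (4.8.39)."""
    Ea2, Eb2, Ec2 = a_bar**2 + S_aa, b_bar**2 + S_bb, c_bar**2 + S_cc
    Eab = a_bar * b_bar + S_ab
    k = a_bar + b_bar * g      # mean regulator parameter (a_bar + b_bar g)
    d = 1.0 - h * c_bar        # mean estimator factor (1 - h c_bar)
    p = Ea2 + 2.0 * Eab * g * h * c_bar + Eb2 * g**2 * h**2 * Ec2 + k**2 * h**2 * Ec2
    q = g * d * (Eab + Eb2 * g * h * c_bar) + k**2 * d * h * c_bar
    r = Eb2 * g**2 * d**2 + k**2 * d**2
    return p, q, r


def eigenvalues(p, q, r):
    """Roots of Eq. (4.8.40) with beta = trace and alpha = determinant."""
    beta = p + r
    alpha = p * r - q * q
    # beta^2 - 4 alpha = (p - r)^2 + 4 q^2 >= 0; the max() guards against round-off.
    root = math.sqrt(max(beta**2 - 4.0 * alpha, 0.0))
    return (beta + root) / 2.0, (beta - root) / 2.0


def is_mean_square_stable(g, h, **stats):
    """Check the condition max(|lambda_1|, |lambda_2|) < 1 of Theorem 4.6."""
    lam1, lam2 = eigenvalues(*expected_AtA(g, h, **stats))
    return max(abs(lam1), abs(lam2)) < 1.0


# Hypothetical statistics and arbitrary trial gains.
stats = dict(a_bar=1.1, b_bar=1.0, c_bar=1.0, S_aa=0.1, S_bb=0.2, S_cc=0.3)
g, h = -0.6, 0.7
lam1, lam2 = eigenvalues(*expected_AtA(g, h, **stats))
print(f"eigenvalues: {lam1:.4f}, {lam2:.4f}")
print(f"mean-square stable: {is_mean_square_stable(g, h, **stats)}")
```

Setting $h = 1/\bar c$ makes the second row and column of $E\{A'A\}$
vanish, so one eigenvalue is zero and the other equals the
(1,1) entry; a search over (g, h) could wrap
`is_mean_square_stable` in a grid or a numerical optimizer.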

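The problem is to choose g and h such that
$\max(|\lambda_1|, |\lambda_2|) < 1$. This involves the solution of a system
of quartic equations in g and h, resulting from the necessary
conditions

$$\frac{\partial}{\partial g}\left[\beta \pm \sqrt{\beta^2 - 4\alpha}\right] = 0$$

and

$$\frac{\partial}{\partial h}\left[\beta \pm \sqrt{\beta^2 - 4\alpha}\right] = 0 \tag{4.8.44}$$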

3"  ion  is  algebraically  cumbe  efluationa 

The  ccputat^  ^ o(  Qutput  feedback 


g 


— i 

~~2  72 

b c 


(4.8.45) 


so 


that  Ed 


(4.8.47)  becomes 


-5  2 

b c 1 


E((A'(t 


) A(t)) 


-2  "2  / ab_c 

_ — abC-  + b 

a2  - 2ab  -2  -5 

b c \ o "9 

— -2\2  c2 

a-b^)  c2 


0 


2\2  c2 


(4.8.46) 
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The  nonzero  eigenvalue  is  thus  given  by 

x *2-S-- + 52 -if- 

vf  2 / c ,2  ,2  2 2 

be/  b b b c 


(4.8.47) 


Hence, the condition that

$$E\{a^2\} - \frac{\left(E\{ab\}\right)^2 \bar c^2}{E\{b^2\}\, E\{c^2\}} < 1$$

does not satisfy the necessary conditions in Eq. (4.8.38).
These are, therefore, not the optimal values of g and h.


Conclusions 

We summarize the main results in this subsection.
It is shown that linear feedback control using the output
directly gives a sufficient condition for the mean-square
stability of the randomly-varying dynamic system. By analogy
with the reasoning in Section 2.5, the optimum gain using
output feedback obtained from the stability analysis is the
true limiting gain for the truly optimal stochastic control
law in the unstable region (in the mean-square sense).

For the fixed structure feedback control system,
we then give the necessary conditions for the optimal gains
and implicitly the necessary conditions for mean-square
stability. It is then shown that the optimum gains derived
for the output feedback do not satisfy the necessary conditions
for the mean-square stability of the fixed structure
control system.
4.9 Conclusions

In this chapter we have presented the results for
the adaptive stochastic control of linear systems with purely
random (white) parameters. The system state cannot be measured
exactly. The measurement data is corrupted by additive
white noise. We first gave the optimum control law in terms
of the conditional means. We know that for this class of
non-linear-quadratic-Gaussian stochastic control problems, the
optimum estimator is nonlinear and requires computation of
all the moments. Hence, we seek adaptive controllers with a
given fixed structure. The class of admissible controllers
is thus restricted to be of the linear feedback regulator type.
The original stochastic system is then transformed into a
deterministic system. We solved the dynamic deterministic
optimization problem first using the Matrix Minimum Principle
and then dynamic programming. With the structure of the
dynamic compensator fixed, we subsequently optimize the free
parameters of the compensator. The free parameters are the
linear control and estimator gains.

In the resulting time-varying feedback controller
the off-line computational requirements seem more severe than
in the case of the optimal stochastic controller. To obtain the
optimal gains we have to solve a coupled nonlinear two-point
boundary value problem involving difference equations. This
is not a trivial computation even compared to solving the
nonlinear filtering problem.
In the fixed structure dynamic compensator, the
control now affects both the mean and variance of the
estimation error. This is an example of cautious control.
This is contrasted with the optimum solution obtained by
stochastic dynamic programming where the minimum variance
of the state estimate is independent of the control. In
the linear minimum variance filter, the control does affect
the estimation accuracy.

For the first time in the literature, the asymptotic
behavior of the linear controller for a stationary
system is examined. Taking an approach analogous to that
in Section 2.5, we derived a sufficient condition for the
existence of the optimum linear feedback controller. We also
derived a sufficient condition for the system to be
mean-square unstabilizable under linear feedback.

In Chapter 3, we obtained the result that the
linear discrete filter is stable if the second moment is
bounded. The necessary and sufficient condition for asymptotic
stability of the second moment is that $\bar a^2 + \Sigma_{aa} < 1$.
This is only a sufficient condition in the fixed structure
optimal control problem. As indicated by the stability
region (boundary) curve derived from computer simulations,
the true stability curve is somewhere between that given by
the output feedback stability analysis in Section 4.8 and
the uncertainty threshold for the exact measurement case in
Section 2.5.
We have shown that for linear dynamic systems
with a fixed structure feedback controller, there exists a
threshold determined by the means and covariances of the
randomly varying parameters such that optimum linear control
laws for the infinite horizon problem exist if and only if
that inequality condition is satisfied.
CHAPTER 5

ON LINEAR MULTIVARIABLE CONTROL SYSTEMS


5.1 Introduction

In this chapter, we shall extend the analysis in Chapter
2 to linear multivariable control systems. We will consider
the exact measurement case in Section 5.2. We state the optimal
stochastic control problem with purely random parameters; the
results have been presented in [35] and [37]. In Section 5.3,
we consider a special case of the problem stated in Section
5.2. In particular, we present the optimality and stability
results when the matrices A and B are multiplied by some
scalars, sequentially uncorrelated in time. The results have
appeared in [51] and [79].

In Section 5.4, we consider the inexact measurement
case, where the observations are corrupted by white noise.
The solution to the fixed structure estimator-controller is
given. The primary motivation for this chapter is to indicate
where the previous results apply and can be extended readily,
and to indicate the mathematical notational complexity and
computational burden required. Basically, no new theoretical
results are presented in the analysis.
5.2 Optimal Control of Systems with Exact Measurements

Consider a first-order linear dynamical system with state
vector x(t) and control u(t) described by the difference
equation

$$x(t+1) = A(t)\,x(t) + B(t)\,u(t) + \xi(t), \qquad t = 0, 1, 2, \ldots \tag{5.2.1}$$

where x(t) is an n-dimensional vector, A(t) is an n x n matrix,
B(t) is an n x m matrix, u(t) is an m-dimensional vector, and
$\xi(t)$ is an n-dimensional white noise vector. The initial state
vector x(0) is given.

It is assumed that we have exact measurement of the state,

$$z(t) = x(t) \tag{5.2.2}$$

In Eq. (5.2.1), it is assumed that the system parameters
A(t) and B(t) contain purely random parameters as elements
which may be grouped into a random parameter vector p(t). It
is assumed that the random vector p(t) is statistically
independent and identically distributed in time. The random vectors
selected at each time may have correlated elements, so that the
off-diagonal elements of the covariance matrix of p(t) may be
nonzero. To be more precise, we assume that for wide-sense
stationary parameters in A(t) and B(t),

$$E\{p(t)\} = \bar p \tag{5.2.3}$$

$$E\{(p(t) - \bar p)(p(\tau) - \bar p)'\} = \Sigma_p\, \delta(t, \tau) \tag{5.2.4}$$

where $\delta(t, \tau)$ is the Kronecker delta function.
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However,  to  be  able  to  write  down  mathematically  the 
ensuing  results  for  the  multivariable  system,  we  will  soon  need 
some  machinery  from  tensor  analysis,  since  the  covariance  of 
A(t)  is  a fourth-order  tensor  of  n4  components.  Alternative- 
ly, because  of  our  particular  formulation  of  the  white  para- 
meter problem,  the  notational  complexity  is  lessened.  We  will 
need  the  relationship  following  [80] 

\[ E\{x'(t+1)\,Q\,x(t+1)\} = \bar{x}'(t+1)\,Q\,\bar{x}(t+1) + \mathrm{tr}\,Q\,\Sigma^{x(t+1)} \tag{5.2.5} \]

where the ij-th element of the matrix $\Sigma^{x(t+1)}$ is given,
using Eq. (5.2.1), by

\[ \Sigma^{x(t+1)}_{ij} = x'(t)\,\Sigma^{A_iA_j}\,x(t) + 2\,x'(t)\,\Sigma^{A_iB_j}\,u(t) + u'(t)\,\Sigma^{B_iB_j}\,u(t) + \Sigma^{\xi_i\xi_j} \tag{5.2.6} \]

where $\Sigma^{A_iB_j}$ is the covariance matrix of the i-th row of A with
the j-th row of B.

Finally, we remark that the additive noise $\xi(t)$ is assumed
to be zero-mean Gaussian white, and independent of {A(t)} and
{B(t)}.*

* If $\xi(t)$ is a colored noise, then we can always generate it
with a pre-whitening filter. If $\xi(t)$ is correlated with A(t)
and B(t), then the expressions in Eq. (5.2.6) are appropriately
changed.

The control problem is the stochastic regulator type
optimization problem with the expected cost function given by

\[ J = E\Big\{ x'(N)\,F\,x(N) + \sum_{t=0}^{N-1} \big[ x'(t)\,Q(t)\,x(t) + u'(t)\,R(t)\,u(t) \big] \Big\} \tag{5.2.7} \]

where Q(t), R(t), and F are symmetric positive semi-definite
matrices. The quadratic cost functional in Eq. (5.2.7) assigns
a real number to each pair of state and control sequences x(t) and u(t).

The admissible controls satisfy $u(t) \in U(t)$, where U(t) is
a subset of the m-dimensional Euclidean space. Since we are
interested in closed-loop controls, the admissible controls
u(t) are assumed to depend only on the a priori given information
and on

\[ Y^t = \{y(0), y(1), \ldots, y(t)\} \quad \text{and} \quad U^{t-1} = \{u(0), u(1), \ldots, u(t-1)\} \]

The stochastic control problem is to find a control sequence
{u(0), u(1), ..., u(N-1)} that minimizes the
expression in Eq. (5.2.7). The solution is given by the stochastic
dynamic programming algorithm.

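The optimal control law is given by+

\[ u^*(t) = -G^*(t)\,x^*(t) \tag{5.2.8} \]

\[ G^*(t) = \big[ R(t) + \bar{B}'K^*(t+1)\bar{B} + \mathrm{tr}\,K^*(t+1)\Sigma^{BB} \big]^{-1} \big\{ \bar{B}'K^*(t+1)\bar{A} + (\mathrm{tr}\,K^*(t+1)\Sigma^{AB})' \big\} \tag{5.2.9} \]

where $\bar{A}$ and $\bar{B}$ denote the mean values of A(t) and B(t), and
the Riccati-like matrix difference equation is

\[ K^*(t) = Q(t) + \bar{A}'K^*(t+1)\bar{A} + \mathrm{tr}\,K^*(t+1)\Sigma^{AA} - S^*(t)\big( \bar{B}'K^*(t+1)\bar{A} + (\mathrm{tr}\,K^*(t+1)\Sigma^{AB})' \big) \tag{5.2.10} \]

\[ K^*(N) = F \]

\[ S^*(t) = \big( \bar{A}'K^*(t+1)\bar{B} + \mathrm{tr}\,K^*(t+1)\Sigma^{AB} \big) \big[ R(t) + \bar{B}'K^*(t+1)\bar{B} + \mathrm{tr}\,K^*(t+1)\Sigma^{BB} \big]^{-1} \tag{5.2.11} \]

The recursive functional equation is thus

\[ V(N) = x^{*\prime}(N)\,F\,x^*(N) \tag{5.2.12} \]

\[ V(t, x^*(t)) = x^{*\prime}(t)\,K^*(t)\,x^*(t) + \sum_{\tau=t}^{N-1} \mathrm{tr}\,K^*(\tau+1)\,\Xi \tag{5.2.13} \]

where $\Xi$ is the covariance matrix of $\xi(t)$.

+ Here $\mathrm{tr}\,K\,\Sigma^{AB}$ denotes the matrix $\sum_i \sum_j k_{ij}\,\Sigma^{A_iB_j}$
(compare Eqs. (5.2.5) and (5.2.6)); with a single nonzero weight $q_{11}$,
for instance, $\mathrm{tr}\,Q\,\Sigma^{BB} = q_{11}\,\Sigma^{B_1B_1}$.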
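One backward step of Eqs. (5.2.9) to (5.2.11) can be sketched numerically as follows. This is only an illustrative sketch under stated assumptions: the function names, the use of NumPy, and the storage layout of the covariances (a four-index array whose block `S[i, j]` holds the covariance of row i with row j) are choices made here, not part of the original development, and `A_bar`, `B_bar` are taken to be the parameter means.

```python
import numpy as np

def tr_sigma(K, S):
    """Shorthand 'tr K Sigma': the matrix sum_ij K[i, j] * S[i, j].

    S has shape (n, n, p, q); S[i, j] is the p x q covariance block of
    row i of one random matrix with row j of another (assumed layout).
    """
    return np.einsum("ij,ijkl->kl", K, S)

def backward_step(K_next, A_bar, B_bar, Q, R, S_AA, S_AB, S_BB):
    """Return (K(t), G(t)) from K(t+1), following Eqs. (5.2.9)-(5.2.11)."""
    # Effective control weighting: R + B'KB + tr K Sigma^BB
    M = R + B_bar.T @ K_next @ B_bar + tr_sigma(K_next, S_BB)
    # Cross term: B'KA + (tr K Sigma^AB)'
    N_ = B_bar.T @ K_next @ A_bar + tr_sigma(K_next, S_AB).T
    G = np.linalg.solve(M, N_)          # gain of Eq. (5.2.9)
    K = (Q + A_bar.T @ K_next @ A_bar + tr_sigma(K_next, S_AA)
         - N_.T @ G)                    # S*(t) N_ with S*(t) = N_' M^{-1}
    return K, G
```

Starting from K(N) = F and calling `backward_step` N times yields the gain sequence; setting all covariance arrays to zero recovers the standard deterministic-parameter Riccati step.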

From the results for the scalar system analyzed in Chapter 2,
we know that the sequence {K(t)} generated
by the Riccati-like equations (5.2.10) and (5.2.11) converges only if
the a priori means and covariances of the randomly varying
parameters satisfy some inequality condition. The steady-state
solution K then satisfies the so-called algebraic Riccati equation,
and the control law has linear constant gains in the
steady-state interval.

The  limiting  gain  for  the  closed-loop  control  system 
exists  even  if  the  Riccati  solutions  diverge,  as  shown  in  the 
scalar  system  examined  in  Chapter  2.  The  gain  in  the  limit 
is  obtained  from  Eq.  (5.2.9).  Alternatively,  the  gain  can  be 
derived  by  considering  the  mean-square  stability  of  a stochastic 
system  Eq.  (5.2.1),  under  linear  feedback  as  demonstrated  in 
Section  2.5.  In  any  case,  the  analysis  will  give  the  stabil- 
ity condition  for  the  closed-loop  stochastic  control  system. 

5.3 Linear Multivariable Control for Systems with Scalar Random Parameters

In this section, we will consider a special case of the
linear multivariable system formulated in Eq. (5.2.1). In particular,
instead of the random matrices we have to deal with in
Eqs. (5.2.1) and (5.2.2), we restrict the randomness to random
scalars multiplying the matrices A and B, so that the notation
and symbols involved in the solutions are much less cumbersome.
The results given in this section are also found in [51]
and generalized in [79].

Consider then the linear discrete-time stochastic system
whose dynamics are described by the vector difference equation

\[ x(t+1) = \gamma(t)\,A\,x(t) + \delta(t)\,B\,u(t) + \xi(t) \tag{5.3.1} \]

Both the system matrix A and the control matrix B are multiplied
by white, possibly correlated, scalar random sequences. We
assume, without loss of generality, that A and B are constant
matrices of appropriate dimensions, since the product $\gamma(t)A$
is time-varying. The additive noise $\xi(t)$ is a zero-mean Gaussian
white noise. Assume that [A,B] is a controllable pair
and that B is n x n and of full rank.

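We further assume that the scalars $\gamma(t)$ and $\delta(t)$ are Gaussian
white random sequences with known stationary statistics.
More precisely, we have:

\[ E\{\gamma(t)\} = \bar\gamma, \qquad E\{(\gamma(t)-\bar\gamma)(\gamma(\tau)-\bar\gamma)\} = \Gamma\,\delta(t,\tau) \tag{5.3.2} \]

\[ E\{\delta(t)\} = \bar\delta, \qquad E\{(\delta(t)-\bar\delta)(\delta(\tau)-\bar\delta)\} = \Delta\,\delta(t,\tau) \tag{5.3.3} \]

\[ E\{(\gamma(t)-\bar\gamma)(\delta(\tau)-\bar\delta)\} = \Lambda\,\delta(t,\tau) \tag{5.3.4} \]

\[ E\{\xi(t)\} = 0, \qquad E\{\xi(t)\,\xi'(\tau)\} = \Xi\,\delta(t,\tau) \tag{5.3.5} \]

where $\delta(t,\tau)$ is the Kronecker delta (not to be confused with the
scalar sequence $\delta(t)$). Furthermore, we assume
that the plant noise $\xi(t)$ is independent of the scalar
random sequences $\gamma(t)$ and $\delta(t)$.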

We want to minimize the standard quadratic cost function (5.2.7).
Assume that $[A, Q^{1/2}]$ is an observable pair.
Under the assumption that we can measure the entire state vector
x(t) exactly at each instant of time, we wish to find the
optimal feedback control sequence u(0), u(1), u(2), ... such
that the quadratic cost (5.2.7) is minimized.

The problem can be readily solved using the dynamic programming
algorithm as in Section 5.2. The optimal control is
in linear state variable feedback form,

\[ u^*(t) = -G^*(t)\,x^*(t) \tag{5.3.7} \]

where the optimal feedback gain is given by

\[ G(t) = \big[ R + (\bar\delta^2 + \Delta)\,B'K(t+1)B \big]^{-1} (\bar\gamma\bar\delta + \Lambda)\,B'K(t+1)A \tag{5.3.8} \]

The n x n matrix K(t) satisfies a recursive matrix equation
of the form

\[ K(t) = (\bar\gamma^2 + \Gamma)\,A'K(t+1)A + Q - (\bar\gamma\bar\delta + \Lambda)^2\,A'K(t+1)B \big[ R + (\bar\delta^2 + \Delta)\,B'K(t+1)B \big]^{-1} B'K(t+1)A \tag{5.3.9} \]

\[ K(N) = 0 \]

We remark, however, that the matrix Riccati-like equation (5.3.9) cannot
be related to a coupled set of linear equations.
Therefore, it will be referred to as the "UTP matrix" equation.
Under our assumptions, the solution to the UTP matrix equation
(5.3.9) exists and is positive definite and bounded for all
finite planning horizon times N. The average optimal cost is
given by

\[ J^*(x(0),N) = x'(0)\,K(0)\,x(0) + \mathrm{tr} \sum_{t=0}^{N} K(t)\,\Xi \tag{5.3.10} \]
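The backward recursion of Eqs. (5.3.8) and (5.3.9) is straightforward to iterate numerically. The sketch below is illustrative only: the function name `utp_backward`, the argument names, and the use of NumPy are assumptions made here; `g_bar`, `d_bar` stand for the means of the scalar multipliers, and `Gamma`, `Delta`, `Lam` for their variances and cross-covariance as in Eqs. (5.3.2) to (5.3.4).

```python
import numpy as np

def utp_backward(A, B, Q, R, g_bar, d_bar, Gamma, Delta, Lam, N):
    """Iterate the UTP matrix equation (5.3.9) backward from K(N) = 0.

    Returns K(0) and the list of gains [G(0), ..., G(N-1)] of Eq. (5.3.8).
    """
    n = A.shape[0]
    K = np.zeros((n, n))                 # terminal condition K(N) = 0
    cross = g_bar * d_bar + Lam          # (mean product + cross-covariance)
    gains = []
    for _ in range(N):
        # At this point K holds K(t+1).
        M = R + (d_bar**2 + Delta) * B.T @ K @ B
        BKA = B.T @ K @ A
        MinvBKA = np.linalg.solve(M, BKA)
        gains.append(cross * MinvBKA)    # G(t), Eq. (5.3.8)
        K = (g_bar**2 + Gamma) * A.T @ K @ A + Q - cross**2 * BKA.T @ MinvBKA
    gains.reverse()                      # gains were generated for t = N-1, ..., 0
    return K, gains
```

With `Gamma = Delta = Lam = 0` and unit means the loop reduces to the ordinary discrete-time Riccati recursion, which gives a convenient sanity check.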

For the infinite horizon case as N → ∞, we are interested
in examining the existence of an optimal solution and the
stabilization of the stochastic system Eq. (5.2.1). We prove
the following theorem.

Theorem 5.1 (Uncertainty Threshold Principle)

An optimal solution exists for the problem given by Eqs.
(5.3.1) to (5.3.6), as N → ∞, if and only if

\[ \max_i |\lambda_i(A)| < \frac{1}{\beta}, \qquad i = 1, 2, \ldots, n \tag{5.3.11} \]

where $\beta$ is defined by

\[ \beta^2 = \bar\gamma^2 + \Gamma - \frac{(\bar\gamma\bar\delta + \Lambda)^2}{\bar\delta^2 + \Delta} \ge 0 \tag{5.3.12} \]

and $\max_i |\lambda_i(A)|$ denotes the magnitude of the maximum eigenvalue
of the constant system matrix A in Eq. (5.3.1).
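The threshold test of Theorem 5.1 amounts to a few lines of arithmetic. The following sketch is a hypothetical helper, not part of the original text: the function names are invented here, the spectral radius of A is assumed to be supplied by the caller, and it is assumed that $\bar\delta^2 + \Delta > 0$ so that the division is defined.

```python
import math

def beta_squared(g_bar, d_bar, Gamma, Delta, Lam):
    """beta^2 of Eq. (5.3.12); assumes d_bar**2 + Delta > 0."""
    return g_bar**2 + Gamma - (g_bar * d_bar + Lam) ** 2 / (d_bar**2 + Delta)

def threshold_satisfied(spectral_radius, g_bar, d_bar, Gamma, Delta, Lam):
    """True if max_i |lambda_i(A)| < 1/beta, i.e. Eq. (5.3.11) holds."""
    b2 = beta_squared(g_bar, d_bar, Gamma, Delta, Lam)
    if b2 <= 0.0:
        # beta = 0 (no parameter uncertainty): the admissible disc is unbounded.
        return True
    return spectral_radius < 1.0 / math.sqrt(b2)
```

For example, with unit means, `Gamma = Delta = 0.5` and `Lam = 0`, one finds $\beta^2 = 1.5 - 1/1.5 = 5/6$, so open-loop eigenvalues must lie inside a disc of radius about 1.095.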

Before we present the proof of the theorem, it is important
to make some remarks.

Remark 1. In the case of non-random parameters ($\Gamma = \Delta = \Lambda = 0$),
$\beta = 0$. This means that, given our assumptions of
[A,B] controllability and $[A, Q^{1/2}]$ observability, one
can always solve the infinite horizon optimal control problem
independent of the (open-loop) eigenvalues of A. On the
other hand, as the variances $\Gamma$ and $\Delta$ of the random parameters
increase, $\beta$ increases and the value of $1/\beta$ defines the
radius of a shrinking disc which must contain all the open-loop
eigenvalues of A in order for the problem to have a
solution.

Remark 2. If the condition in Eq. (5.3.11) is violated,
i.e., if

\[ \max_i |\lambda_i(A)| \ge \frac{1}{\beta} \tag{5.3.13} \]

then there is no solution to the optimal control problem,
and one cannot stabilize (in the mean-square sense) the system
of Eq. (5.2.1). Under condition (5.3.13), the optimal
cost in (5.3.10) undergoes exponential growth as N increases,
so that

\[ J^*(N) \ge c\, e^{\max_i |\lambda_i(\beta A)|\, N}, \qquad c = \text{constant} \tag{5.3.14} \]

Because of the explosive growth of the optimal cost in (5.3.14),
only the short-term (small N) control makes sense; see
also Section 2.4.

As in the scalar system in Section 2.4, even if condition
Eq. (5.3.13) holds, the control gain matrix G(t) in (5.3.8)
remains well-behaved and is bounded, so the limiting gain exists:

\[ G = \lim_{N\to\infty} \frac{\bar\gamma\bar\delta + \Lambda}{\bar\delta^2 + \Delta} \big[ B'K(t+1)B \big]^{-1} B'K(t+1)A \tag{5.3.15} \]

Next, we present the details of proving Theorem 5.1. We
remark that the proof essentially uses algebraic manipulations
and well known properties of the discrete Lyapunov and Riccati
matrix equations. The main idea of the proof is to examine
the behavior of $\lim_{N\to\infty} K(t)$, or the behavior "backward in time"
of the UTP matrix equation (5.3.9). The arguments are similar
to those used in [51].

Proof: For the sake of notational convenience, define
the scalars

\[ \alpha_1 = \bar\gamma^2 + \Gamma, \qquad \alpha_2 = (\bar\gamma\bar\delta + \Lambda)^2, \qquad \alpha_3 = \frac{1}{\bar\delta^2 + \Delta} \tag{5.3.16} \]

The UTP matrix equation (5.3.9) can then be written as

\[ K(t) = \alpha_1 A'K(t+1)A + Q - \alpha_2 A'K(t+1)B \Big[ R + \frac{1}{\alpha_3} B'K(t+1)B \Big]^{-1} B'K(t+1)A \tag{5.3.17} \]

From Eqs. (5.3.12) and (5.3.16), we obtain that

\[ \beta^2 = \alpha_1 - \alpha_2\alpha_3 \tag{5.3.18} \]

By adding and subtracting $\alpha_2\alpha_3 A'K(t+1)A$ on the right-hand
side of (5.3.17), and after some algebraic manipulations, Eq.
(5.3.17) reduces to

\[ K(t) = \beta^2 A'K(t+1)A + Q + \alpha_2\alpha_3 A' \big\{ K(t+1) - K(t+1)B \big[ \alpha_3 R + B'K(t+1)B \big]^{-1} B'K(t+1) \big\} A \tag{5.3.19} \]
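The "algebraic manipulations" leading from (5.3.17) to (5.3.19) can be spelled out in three short steps. The worked sketch below writes K for K(t+1) and assumes $\alpha_3 > 0$; it introduces no new results.

```latex
% Step 1: pull the scalar \alpha_3 out of the inverse in (5.3.17).
\[
\Big[ R + \tfrac{1}{\alpha_3} B'KB \Big]^{-1}
  = \alpha_3 \big[ \alpha_3 R + B'KB \big]^{-1},
  \qquad K \equiv K(t+1),\ \alpha_3 > 0 .
\]
% Step 2: split \alpha_1 using \beta^2 = \alpha_1 - \alpha_2\alpha_3 from (5.3.18).
\[
\alpha_1 A'KA = \beta^2 A'KA + \alpha_2\alpha_3 A'KA .
\]
% Step 3: substitute both into (5.3.17) and collect the \alpha_2\alpha_3 terms.
\[
K(t) = \beta^2 A'KA + Q
     + \alpha_2\alpha_3 A' \Big\{ K - KB \big[ \alpha_3 R + B'KB \big]^{-1} B'K \Big\} A .
\]
```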

Attention is focused on the matrix we now define:

\[ M(t+1) \triangleq K(t+1) - K(t+1)B \big[ \alpha_3 R + B'K(t+1)B \big]^{-1} B'K(t+1) \tag{5.3.20} \]

Such matrices arise naturally in the matrix Riccati equation
of standard linear-quadratic problems where the control weighting
matrix is $\alpha_3 R$ [81]. Under the given assumptions of [A,B]
controllability and $[A, Q^{1/2}]$ observability, it is well known
[81], [82] that

\[ M(t+1) = M'(t+1) > 0 \tag{5.3.21} \]

and there exists a bound

\[ L > M(t), \quad \text{for all } t \tag{5.3.22} \]

Since M(t+1) is positive definite, so is $\alpha_2\alpha_3 A'M(t+1)A$.
Hence, we readily obtain

\[ K(t) > \beta^2 A'K(t+1)A + Q \tag{5.3.23} \]

From Eq. (5.3.23) it is obvious that if any eigenvalue
of $(\beta A)$ has magnitude greater than unity, then K(t) grows without
bound backward in time, $\lim_{N\to\infty} K(t)$ does not exist, and the
optimal cost undergoes exponential growth as given by
Eq. (5.3.14).

On the other hand, from (5.3.22) and (5.3.23), we obtain
that

\[ K(t) < \beta^2 A'K(t+1)A + Q + \alpha_2\alpha_3 A'LA \tag{5.3.24} \]

Hence, if all the eigenvalues of $(\beta A)$ have magnitude less than unity,
the right-hand side of the recursion Eq. (5.3.24) will approach
a bounded constant solution matrix, and so will K(t). The
limiting solution $\lim_{N\to\infty} K(t)$ is well defined.

We make an important remark that the proof requires that
the B matrix be n x n and nonsingular, as required in the corollary
of [51]. However, we believe that this is a sufficient,
but by no means a necessary, condition.
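The two halves of the proof can be observed numerically in the scalar case (n = 1), where A, B, Q, R reduce to numbers a, b, q, r. The sketch below is a hypothetical illustration with invented names and parameter values; it iterates the scalar form of Eq. (5.3.9) and also returns $\beta^2$ so that the lower bound (5.3.23) can be checked step by step.

```python
def scalar_utp(a, b, q, r, g_bar, d_bar, Gamma, Delta, Lam, N):
    """Iterate the scalar form of Eq. (5.3.9) backward from k(N) = 0.

    Returns (beta2, history) where history[j] is k(N - j).
    """
    a1 = g_bar**2 + Gamma                 # alpha_1
    a2 = (g_bar * d_bar + Lam) ** 2       # alpha_2
    inv_a3 = d_bar**2 + Delta             # 1 / alpha_3
    beta2 = a1 - a2 / inv_a3              # Eq. (5.3.18)
    k = 0.0
    history = [k]
    for _ in range(N):
        k = a1 * a * a * k + q - a2 * (a * k * b) ** 2 / (r + inv_a3 * b * b * k)
        history.append(k)
    return beta2, history

# Small uncertainty: beta * |a| < 1, so k settles to a finite limit.
beta2_small, h_small = scalar_utp(1.2, 1.0, 1.0, 1.0, 1.0, 1.0, 0.1, 0.1, 0.0, 60)
# Large uncertainty: beta * |a| > 1, so k grows without bound.
beta2_large, h_large = scalar_utp(1.2, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0, 60)
```

In the first run $\beta^2 a^2 \approx 0.27$ and the sequence converges; in the second $\beta^2 a^2 = 2.16$ and the sequence grows at least geometrically, in line with the bound (5.3.23).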

5.4 Optimal Control for Systems with Inexact Measurements

In this section, we shall consider the optimal stochastic
control of linear dynamical systems with purely random parameters
and imperfect measurements. More precisely, we have the
same linear dynamical system as in Eq. (5.2.1), but the measurement
data are now assumed to be corrupted by additive white
noise, i.e.,

\[ z(t) = C(t)\,x(t) + \theta(t) \tag{5.4.1} \]

where $\theta(t)$ is the zero-mean Gaussian white noise vector, and
C(t) is assumed to contain elements that are randomly varying.

This general case has been considered in [37]. The
cost functional we want to minimize is that given by Eq. (5.2.7).
Using the dynamic programming algorithm, the optimal control at
t = N-1 is given by

\[ u^*(N-1) = -G(N-1)\,\hat{x}(N-1/N-1) \tag{5.4.2} \]

where $\hat{x}(N-1/N-1)$ is the conditional mean of x(N-1) given
the past measurements up to time N-1 under controls up to N-2, and

\[ G(N-1) = \big[ R(N-1) + B'(N-1)\,F\,B(N-1) \big]^{-1} B'(N-1)\,F\,A(N-1) \tag{5.4.3} \]

We note that in computing the optimal control at t = N-2,
it is important that the conditional covariance matrix
$\Sigma_{xx}(N-1/N-1)$ of the estimation error be independent of x(N-1) and

\[ Z^{N-1} = \{z(0), \ldots, z(N-1)\} \]

If this is true, then the covariance
will be independent of the past controls. In linear-quadratic-Gaussian
problems, the linearity of both the system
and measurement equations is sufficient for the conditional
covariance to be independent of past controls. Under these
assumptions, the optimal control at time t is given by

\[ u^*(t) = -G(t)\,\hat{x}(t/t) \tag{5.4.4} \]

where

\[ G(t) = \big[ R(t) + B'(t)K(t+1)B(t) \big]^{-1} B'(t)K(t+1)A(t) \tag{5.4.5} \]

and the Riccati-like matrix difference equation is given by

\[ K(t) = A'(t)K(t+1)A(t) + Q(t) - A'(t)K(t+1)B(t) \big[ R(t) + B'(t)K(t+1)B(t) \big]^{-1} B'(t)K(t+1)A(t), \qquad K(N) = F \tag{5.4.6} \]

Note  that  the  optimal  gain  G(t)  in  Eq . (5.4.5)  is  not  random. 
The  optimal  cost-to-go  expression  is  then  given  by 


The conditional mean is not computable in closed form, since
the truly optimum filter is infinite-dimensional.* However,
analogous to the development in Section 4.3, we can restrict
our attention to a fixed structure dynamic compensator, where
we cascade a linear filter with a linear controller. We then
reformulate the original stochastic control problem as a
deterministic parameter optimization problem, using only the
first and second unconditional moments.

* Aoki's book contains an error in using the Kalman filter.

Fixed-Structure Linear Controller

Suppose the linear multivariable dynamic system is described
by the vector difference equation

\[ x(t+1) = A(t)\,x(t) + B(t)\,u(t) + \xi(t), \qquad t = 0, 1, 2, \ldots \tag{5.4.8} \]

where x(t) is the n-dimensional state vector in $R^n$,
u(t) is the m-dimensional control vector in $R^m$,
$\xi(t)$ is the zero-mean white Gaussian noise vector
with covariance matrix $\Xi$,
x(0) is a random vector with mean $\bar{x}_0$ and covariance $\Sigma_{x_0}$.

The measurement data are given by

\[ z(t) = C(t)\,x(t) + \theta(t) \tag{5.4.9} \]

where z(t) is the actual r-dimensional sensor measurement
vector in $R^r$,
$\theta(t)$ is the r-dimensional zero-mean white Gaussian noise
vector with covariance matrix $\Theta$.

The matrices A(t), B(t) and C(t) in Eqs. (5.4.8) and (5.4.9)
contain elements that are uncertain. We assume that the unknown
parameters are purely random processes. We also assume that
their structure is known.

In the case of stochastic regulators, we define the
scalar index of performance by a quadratic cost functional of
the form

\[ J = E\Big\{ x'(N)\,F\,x(N) + \sum_{t=0}^{N-1} \big[ x'(t)\,Q\,x(t) + u'(t)\,R\,u(t) \big] \Big\} \tag{5.4.10} \]

where F, Q, R ≥ 0. The optimal stochastic control problem is
to minimize the cost functional in Eq. (5.4.10) subject to the
dynamic system constraints Eqs. (5.4.8) and (5.4.9).

We fix the structure of the stochastic controller
or compensator to be considered. The stochastic control
at each instant of time is to be generated by the time-varying
control law

\[ u(t) = -G(t)\,\hat{x}(t), \tag{5.4.11} \]

$\hat{x}(t) \in R^n$ (n arbitrary, but finite).

The quantity $\hat{x}(t)$ is the estimate of the true
state vector x(t) and is to be generated by the linear unbiased
estimator

\[ \hat{x}(t) = \big( I - H(t)\bar{C}(t) \big) \big( \bar{A}(t-1) - \bar{B}(t-1)G(t-1) \big) \hat{x}(t-1) + H(t)\,z(t) \tag{5.4.12} \]

\[ \hat{x}(0) = \bar{x}_0 \]

where $\bar{A}$, $\bar{B}$ and $\bar{C}$ denote the mean values of the random
matrices A, B and C.

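From the results presented in Section 4.4, we can write
down the recursive equations for the propagation of the second
moment matrices.+

\[ \begin{aligned} M_{00}(t) ={}& \big(\bar{A} - \bar{B}G(t-1)\big) M_{00}(t-1) \big(\bar{A} - \bar{B}G(t-1)\big)' + \bar{B}G(t-1) M_{01}(t-1) \big(\bar{A} - \bar{B}G(t-1)\big)' \\ &+ \big(\bar{A} - \bar{B}G(t-1)\big) M_{01}'(t-1) G'(t-1) \bar{B}' + \bar{B}G(t-1) M_{11}(t-1) G'(t-1) \bar{B}' + \Xi \\ &+ \mathrm{tr}\,\Sigma^{AA} M_{00}(t-1) + \mathrm{tr}\,\Sigma^{BB} G(t-1) \big[ M_{00}(t-1) - M_{01}(t-1) - M_{01}'(t-1) + M_{11}(t-1) \big] G'(t-1) \end{aligned} \tag{5.4.13} \]

\[ \begin{aligned} M_{01}(t) ={}& \big(I - H(t)\bar{C}(t)\big) \Big[ \bar{A} \big( M_{01}(t-1) (\bar{A} - \bar{B}G(t-1))' + M_{11}(t-1) G'(t-1) \bar{B}' \big) + \mathrm{tr}\,\Sigma^{AA} M_{00}(t-1) + \Xi \\ &+ \mathrm{tr}\,\Sigma^{BB} G(t-1) \big( M_{00}(t-1) - M_{01}(t-1) - M_{01}'(t-1) + M_{11}(t-1) \big) G'(t-1) \Big] \end{aligned} \tag{5.4.14} \]

\[ \begin{aligned} M_{11}(t) ={}& \big(I - H(t)\bar{C}(t)\big) \Big[ \bar{A} M_{11}(t-1) \bar{A}' + \mathrm{tr}\,\Sigma^{AA} M_{00}(t-1) + \Xi \\ &+ \mathrm{tr}\,\Sigma^{BB} G(t-1) \big( M_{00}(t-1) - M_{01}(t-1) - M_{01}'(t-1) + M_{11}(t-1) \big) G'(t-1) \Big] \big(I - H(t)\bar{C}(t)\big)' \\ &+ H(t) \big( \mathrm{tr}\,\Sigma^{CC} M_{00}(t) + \Theta \big) H'(t) \end{aligned} \tag{5.4.15} \]

+ For example, for a 2 x 2 matrix X,
$\mathrm{tr}\,\Sigma^{AA} X = x_{11}\Sigma^{A_1A_1} + x_{12}\Sigma^{A_1A_2} + x_{21}\Sigma^{A_2A_1} + x_{22}\Sigma^{A_2A_2}$,
where $\Sigma^{A_iA_j}$ = covariance of the i-th column of A and the j-th
column of A.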

The cost function to be minimized becomes:

\[ J = \mathrm{tr}\,[F M_{00}(N)] + \sum_{t=0}^{N-1} \Big\{ \mathrm{tr}\,[Q M_{00}(t)] + \mathrm{tr}\,\big[ G'(t)\,R\,G(t) \big( M_{00}(t) - M_{01}(t) - M_{01}'(t) + M_{11}(t) \big) \big] \Big\} \tag{5.4.16} \]

The original problem has now been reformulated as a minimization
over the elements of the controller matrices G(t) and
H(t).

The deterministic optimization problem is then: given
the constraint equations (5.4.13) to (5.4.15), the initial
conditions

\[ M(0) = \Sigma_{x_0} \tag{5.4.17} \]

and the cost functional, Eq. (5.4.16), find the controller
matrices G*(t) and H*(t) such that J is minimized.

The  optimization  problem  can  be  solved  using  the  Matrix 
Minimum  Principle  or  Dynamic  Programming  algorithm.  The  re- 
sults are  summarized  in  the  following  theorem.  The  proof  is 
similar  to  that  given  for  the  scalar  system,  and  hence  will 
not  be  repeated. 


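Theorem 5.2

The optimal gain matrices for the deterministic optimization
problem formulated in Eqs. (5.4.13) to (5.4.17) are given
by

\[ H^*(t) = \Sigma_{xx}(t/t-1)\,\bar{C}' \big[ \bar{C}\,\Sigma_{xx}(t/t-1)\,\bar{C}' + \mathrm{tr}\,\Sigma^{CC} X(t) + \Theta \big]^{-1} \tag{5.4.18} \]

\[ \begin{aligned} G^*(t-1) ={}& \Big[ \bar{B}' \big( \mathrm{tr}\,H'(t)P(t)H(t)\Sigma^{CC} + K(t) \big) \bar{B} + R \\ &+ \mathrm{tr}\,\big( \mathrm{tr}\,H'(t)P(t)H(t)\Sigma^{CC} + (I - H(t)\bar{C})' P(t) (I - H(t)\bar{C}) + K(t) \big) \Sigma^{BB} \Big]^{-1} \\ &\times \bar{B}' \big( \mathrm{tr}\,H'(t)P(t)H(t)\Sigma^{CC} + K(t) \big) \bar{A} \end{aligned} \tag{5.4.19} \]

\[ \begin{aligned} K(t) ={}& \bar{A}' \big( \mathrm{tr}\,H'(t+1)P(t+1)H(t+1)\Sigma^{CC} + K(t+1) \big) \bar{A} + Q \\ &- \bar{A}' \big( \mathrm{tr}\,H'(t+1)P(t+1)H(t+1)\Sigma^{CC} + K(t+1) \big) \bar{B} \\ &\quad \times \Big[ \bar{B}' \big( \mathrm{tr}\,H'(t+1)P(t+1)H(t+1)\Sigma^{CC} + K(t+1) \big) \bar{B} + R \\ &\qquad + \mathrm{tr}\,\big( \mathrm{tr}\,H'(t+1)P(t+1)H(t+1)\Sigma^{CC} + (I - H(t+1)\bar{C})' P(t+1) (I - H(t+1)\bar{C}) + K(t+1) \big) \Sigma^{BB} \Big]^{-1} \\ &\quad \times \bar{B}' \big( \mathrm{tr}\,H'(t+1)P(t+1)H(t+1)\Sigma^{CC} + K(t+1) \big) \bar{A} \\ &+ \mathrm{tr}\,\big( \mathrm{tr}\,H'(t+1)P(t+1)H(t+1)\Sigma^{CC} + K(t+1) + (I - H(t+1)\bar{C})' P(t+1) (I - H(t+1)\bar{C}) \big) \Sigma^{AA} \end{aligned} \tag{5.4.20} \]

\[ K(N) = F \]

\[ \begin{aligned} P(t) ={}& \bar{A}' (I - H(t+1)\bar{C})' P(t+1) (I - H(t+1)\bar{C}) \bar{A} \\ &+ \bar{A}' \big( \mathrm{tr}\,H'(t+1)P(t+1)H(t+1)\Sigma^{CC} + K(t+1) \big) \bar{B} \\ &\quad \times \Big[ \bar{B}' \big( \mathrm{tr}\,H'(t+1)P(t+1)H(t+1)\Sigma^{CC} + K(t+1) \big) \bar{B} + R \\ &\qquad + \mathrm{tr}\,\big( \mathrm{tr}\,H'(t+1)P(t+1)H(t+1)\Sigma^{CC} + (I - H(t+1)\bar{C})' P(t+1) (I - H(t+1)\bar{C}) + K(t+1) \big) \Sigma^{BB} \Big]^{-1} \\ &\quad \times \bar{B}' \big( \mathrm{tr}\,H'(t+1)P(t+1)H(t+1)\Sigma^{CC} + K(t+1) \big) \bar{A} \end{aligned} \tag{5.4.21} \]

\[ P(N) = 0 \]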


where we also identify $X(t) = M_{00}(t)$ and $\Sigma_{xx}(t/t) = M_{11}(t)$.

It can be shown that $M_{11}(t) = M_{01}(t)$; hence the state
second moment is given by

\[ \begin{aligned} X(t) ={}& \big(\bar{A} - \bar{B}G(t-1)\big) X(t-1) \big(\bar{A} - \bar{B}G(t-1)\big)' + \bar{B}G(t-1) M_{11}(t-1) \big(\bar{A} - \bar{B}G(t-1)\big)' \\ &+ \big(\bar{A} - \bar{B}G(t-1)\big) M_{11}(t-1) G'(t-1) \bar{B}' + \Xi + \bar{B}G(t-1) M_{11}(t-1) G'(t-1) \bar{B}' \\ &+ \mathrm{tr}\,\Sigma^{AA} X(t-1) + \mathrm{tr}\,\Sigma^{BB} G(t-1) \big[ M_{00}(t-1) - M_{11}(t-1) \big] G'(t-1) \end{aligned} \tag{5.4.22} \]

The optimal cost is given by

\[ \begin{aligned} J ={}& \mathrm{tr}\,\big[ K(0)X(0) + P(0)\Sigma_{xx}(0/0) \big] + \mathrm{tr} \sum_{t=0}^{N-1} \Big\{ K(t+1)\,\Xi + P(t+1) (I - H(t+1)\bar{C})\,\Xi\,(I - H(t+1)\bar{C})' \\ &+ P(t+1)H(t+1) \big( \mathrm{tr}\,X(t+1)\Sigma^{CC} \big) H'(t+1) + P(t+1)H(t+1)\,\Theta\,H'(t+1) \Big\} \end{aligned} \tag{5.4.23} \]


To  obtain  the  optimal  gain  matrices,  we  have  to  solve 
a coupled  nonlinear  two-point  boundary  value  problem  which 
involves  matrix  difference  equations  (5.4.20)  - (5.4.22)  and 
(5.4.15).  Hence,  there  is  no  separation  between  control  and 
filter  equations.  Numerical  solutions  to  the  TPBVP  can  be 
obtained  by  using  the  successive  approximation  method. 


5.5 Conclusions

For algebraic simplicity, we have thus far restricted
our consideration to the case of scalar linear control systems
only. An extension to the linear multivariable systems with
purely random parameters can be made in a straightforward
manner. The results of Sections 2.2 and 2.3 are extended in
Sections 5.2 and 5.3 for a particular class of problems in
which the random parameters are scalar variables. The results
of Sections 4.2 to 4.4 are extended in Section 5.4 to
linear multivariable control systems.

For  the  special  case  of  a multivariable  linear  system,  we 
derived  a threshold  condition  involving  the  maximum  eigenvalue 
of  the  system  matrix  A and  the  means,  variances,  and  cross- 
correlations of  the  purely  random  parameters.  If  the  threshold 
condition  is  violated,  then  there  does  not  exist  an  optimal 
solution  to  the  infinite  horizon  problem.  The  linear  multi- 
variable  system  is  then  not  mean-square  stabilizable  under 
linear  feedback. 

In Section 5.4, we presented the form of the solution for
linear multivariable systems controlled by a fixed structure
feedback regulator. The specific
notation used readily degenerates to the standard
linear-quadratic-Gaussian solutions. Other possible notations involve
tensors and Kronecker products (direct products). The complexity
of the matrix difference equations makes even the
computer simulations a nontrivial problem.


CHAPTER  6 


SUMMARY  AND  CONCLUSIONS 

6.1 Summary of the Main Results

In this research, our objective has been to investigate
the optimal stochastic control of linear dynamical
systems with purely random (white) parameters. The uncertain
parameters are thus uncorrelated in time. The white parameter
approach  to  adaptive  stochastic  control  is  important  because 
it  shows  (in  a worst  case  sense)  the  fact  that  the  control 
gains  of  an  optimal  stochastic  system  with  purely  random 
parameters  depend  not  only  upon  the  mean  values,  but  also 
upon  the  variances  of  the  random  parameters.  The  solution 
of  this  class  of  problems  illustrates  how  the  effects  of 
model  parameter  uncertainty,  as  quantified  by  the  parameter 
variances,  modulate  the  control  gains,  thus  introducing  the 
notion  of  "hedging"  in  the  presence  of  dynamic  uncertainty. 

In Chapter 2 we analyzed the adaptive stochastic
control of uncertain systems with exact measurements of the
state. We obtained the time-varying linear feedback control
law. We then investigated the existence of an optimal control
law for the infinite horizon problem. The result is known
as the Uncertainty Threshold Principle. The solution to the
discounted cost problem further emphasizes the issue of
optimality versus stability in adaptive stochastic control
problems. Optimality is based on Bellman's Principle of
Optimality or Pontryagin's Maximum Principle. Stability of
stochastic systems under feedback is an extension of the
Lyapunov stability concept for deterministic systems. Both
the almost sure (pointwise) stability and mean-square
stability criteria are obtained for the perfect measurement
system.

In  Chapter  3 we  analyzed  the  optimal  stochastic 
estimation  of  linear  systems  with  randomly  varying  parameters. 
Since  the  optimal  estimator  is  nonlinear  and  infinite  dimen- 
sional, we  derived  the  optimal  linear  minimum  variance  un- 
biased filter.  The  optimal  linear  estimator  turns  out  not 
to  be  the  dual  of  the  optimal  control  problem  considered  in 
Chapter  2.  We  note  that  the  optimal  solution  derived  in 
Chapter  2 is  the  truly  optimal  control  law,  whereas  the  opti- 
mal state  reconstructor  in  Chapter  3 is  only  the  linear  min- 
imum variance  estimator. 

In  Chapter  4 we  considered  the  optimal  control  of 
linear  systems  with  purely  random  parameters  and  noisy  sensor 
measurement  data.  Hence,  we  need  to  solve  simultaneously 
the  optimal  stochastic  estimation  and  the  optimal  stochastic 
control  problems.  The  optimal  controller  must  reduce  the 
uncertainty  in  the  state  and  regulate  the  process.  In  the 
case  of  stochastic  systems  with  uncertain  parameters,  "good" 
knowledge  of  the  future  values  of  the  state  is  not  available. 
Our approach is to let the mathematical formulation of the
problem handle the complex tradeoff between good identification
and good control, and provide the optimal solution containing
the appropriate strategy for optimizing a performance
index as a function of time. Since the problem is a nonlinear
stochastic control problem, we then consider the design
of the control structure composed of a linear controller
and  a linear  estimator.  We  thus  transform  the  original  sto- 
chastic control  problem  into  a deterministic  parameter 
optimization  problem.  We  jointly  optimize  the  control  and 
filter  gains  to  minimize  the  expected  value  of  the  quadratic 
performance  index. 

The solution to the deterministic optimization
problem can be obtained using the Matrix Minimum Principle
or the dynamic programming method. We then considered the
infinite horizon problem, and found the stability region for
a particular set of parameter means and variances through
the computer simulations of the two-point boundary value
problem. We carried out the mean-square stability analysis
using the direct output feedback, and obtained a sufficient
condition for stability.

In Chapter 5 the results obtained for the scalar
systems are then generalized to linear multivariable systems.
The notations quickly become cumbersome. We then considered
a special class of linear multivariable control systems where
the constant system matrices are multiplied by scalar random
variables, and derived stability criteria for such a system.
We also indicated the form of solution to the optimal control
problem of systems with noisy sensor measurements employing
the design based upon the decomposition of the control
structure into a linear control and linear estimator of fixed
finite dimension.

6.2 Conclusions

We have shown in this thesis that for dynamic systems
with known structure, but randomly varying parameters
(modelled as white noise), the Uncertainty Threshold Principle
states that optimal infinite horizon control exists if
and only if the dynamic uncertainty (as quantified by the
means and variances of the uncertain parameters) satisfies
a certain threshold condition. If this threshold is exceeded,
then the optimal stationary control does not exist.

Further, the results obtained for the discounted
cost problem seem to imply that one has to be careful in
interpreting the stochastic optimization results, and that
an independent stochastic stability analysis should be performed.
In most stochastic optimization problems solved to
date, optimality and stability are not in conflict; optimal
feedback controllers result in stable systems. (The system
may be inherently unstable in the absence of control.) This
is clearly not the case for uncertain systems in which the
randomness enters multiplicatively, and not only additively
(as in the standard linear-quadratic-Gaussian problems).

The results on the optimal linear state reconstruction
for systems with randomly varying parameters give a
sufficient condition for the stability of the linear estimator.
The condition turns out to be sufficient to ensure that
the uncertainty threshold condition in Chapter 2 will be met.
It is also the necessary and sufficient condition for the
asymptotic variance of the uncontrolled linear system to be
finite.

In the fixed structure linear control and estimator design for
systems with noisy sensor data, the filter stability condition
dominates the control stability condition. If the filter stability
condition is satisfied, then the specific structure dynamic
compensator has a steady-state solution. The linear controller is
mean-square stabilizable under feedback for all system parameter
means and variances that satisfy the linear estimator stability
criteria. If the linear dynamic compensator is stabilizable, then
the uncertainty threshold for the exact measurement case is
satisfied. The true stability criterion for the case with
randomness in the measurement equation lies between the above two
stability conditions. The stability region for linear systems with
random measurement parameters is much reduced from the exact
measurement case, but it is larger than that for the linear
minimum variance filter problem.

6.3  Suggestions for Future Research

(A) It would be desirable to derive the uncertainty threshold
condition for linear multivariable systems, both for the exact
measurement and the noisy sensor measurement cases. It is obvious
that the uncertainty threshold condition will involve the means
and covariances of the purely random (white) parameters. In the
perfect measurement case, the analog of Eq. (2.4.13) appears to be
a matrix recursion of the form

\[
K(t) \approx M\,K(t+1)\,M'
\]

where the matrix M contains all the mean values of the vector
parameters and their covariance matrices. Non-existence of a
stationary solution for the infinite horizon problem would result
if an eigenvalue of the matrix M is greater than unity.
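
The role of the eigenvalues of M in such a recursion can be sketched for a 2x2 case. This is only an illustration under stated assumptions: the approximate recursion is taken as the exact iteration K <- M K M', the matrices below are hypothetical, and "greater than unity" is interpreted as an eigenvalue magnitude greater than one. The helper names are not from the thesis.

```python
import cmath


def matmul(x, y):
    """Product of two square matrices stored as nested lists."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def transpose(x):
    """Transpose of a matrix stored as nested lists."""
    return [list(row) for row in zip(*x)]


def spectral_radius_2x2(m):
    """Largest eigenvalue magnitude of a 2x2 matrix, from its trace and determinant."""
    tr = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    root = cmath.sqrt(tr * tr - 4.0 * det)
    return max(abs((tr + root) / 2.0), abs((tr - root) / 2.0))


def frobenius_after(m, steps):
    """Frobenius norm of K after `steps` iterations of K <- M K M', starting from K = I."""
    k = [[1.0, 0.0], [0.0, 1.0]]
    mt = transpose(m)
    for _ in range(steps):
        k = matmul(matmul(m, k), mt)
    return sum(v * v for row in k for v in row) ** 0.5


if __name__ == "__main__":
    # Hypothetical matrices with spectral radius below and above one.
    for m in ([[0.9, 0.2], [0.0, 0.5]], [[1.1, 0.0], [0.3, 0.4]]):
        print(spectral_radius_2x2(m), frobenius_after(m, 100))
```

With every eigenvalue inside the unit circle the iterates decay; with one outside they grow geometrically, so no stationary solution is reached.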

(B) The results given in this thesis can be used to analyze the
performance of aggregated small models versus the large model. The
approach is to treat certain coefficients in the aggregated model
as being purely random. The variance of the coefficients should be
such that the forecasts of the state variables generated by the
more complex model would fall within three standard deviations of
the forecasts generated by the aggregated model. To accomplish
this may require some of the uncertain parameters of the
aggregated model to be time-varying, and analytical methods will
have to be developed to determine how the variances are to be
chosen. It is conjectured that the optimum determination of the
parameter covariance matrix can be formulated and solved as a
deterministic optimal control problem using the minimum principle.

We remark that the use of random coefficient models in economic
policy analysis is very common. The benefit to economic
stabilization policy analysis is apparent if we are able to devise
methods for evaluating aggregation costs in a well-defined manner.

(C) An important aspect in designing an optimal stochastic
control law concerns the sensitivity of the resultant system to
large parameter variations. The analysis of stochastic systems
with randomly varying parameters can further develop and aid the
design of optimal stochastic controllers. The dependence of the
random parameter system control law on the parameter covariances
can systematically indicate which system parameters are more
important in the design of the closed-loop system controls. An
extension and application of the theory in this thesis to the
socio-economic models in [83] would demonstrate the importance of
an understanding of random parameter systems.

(D) The analytical results from the adaptive stochastic control
of linear systems with white parameters are applicable to the
Multiple Model Adaptive Control (MMAC) systems design [84], since
in the MMAC design we hypothesize a set of possible models to
which the actual operating system we are trying to control may
belong. An additional quantitative measure can be introduced into
the analysis by assigning a priori variances to the parameters to
reflect the uncertainty in our knowledge of the random system
parameters.

(E) The results in this thesis research can be directed toward
the further understanding of the dual control methods. Since most
dual adaptive control algorithms are computationally iterative in
nature, the assumption of purely random parameters in the future
can reduce the analysis required to generate an approximate
optimal nonlinear stochastic control law. We have seen that the
control gains of the optimal stochastic control system are
strongly modulated by the uncertainty level of the random
parameters. These results can be used to refine the suboptimal
dual control methods so as to preserve the planned learning
concept, but reduce the real-time computational requirements.

APPENDIX A

DERIVATION  OF  THE  OPTIMAL  LINEAR  CONTROL  USING 
THE  MATRIX  MINIMUM  PRINCIPLE 

The system defined by the difference equations (4.4.12) -
(4.4.18) and the scalar cost functional Eq. (4.4.5) are in the
form required to use the matrix minimum principle [71]. So, let
P(t) be the co-state matrix associated with M(t). The Hamiltonian
function $\mathcal{H}(M(t), P(t+1), G(t), H(t+1), t)$ for our
problem is then

\[
\mathcal{H}[M(t), P(t+1), G(t), H(t+1), t]
  = \mathrm{tr}[Q(t)M(t)] + \mathrm{tr}[(M(t+1) - M(t))\,P'(t+1)]
\tag{A.1}
\]

If $\{G^*(t),\ t = 0,1,\ldots,N-1\}$ and
$\{H^*(t+1),\ t = 0,1,\ldots,N-1\}$ are optimal gains and
$\{M^*(t),\ t = 0,1,\ldots,N\}$ is the optimal state, then the
discrete minimum principle states that there exists a co-state
$\{P^*(t),\ t = 0,1,\ldots,N\}$ such that the following hold:

The canonical equations are given by

\[
M^*(t+1) - M^*(t) = \left.\frac{\partial \mathcal{H}}{\partial P(t+1)}\right|_{*}
\tag{A.2}
\]

\[
P^*(t+1) - P^*(t) = -\left.\frac{\partial \mathcal{H}}{\partial M(t)}\right|_{*}
\tag{A.3}
\]

The boundary conditions are given by

\[
M^*(0) = M(0)
\tag{A.4}
\]

\[
P^*(N) = F
\tag{A.5}
\]

First, we expand the terms in the Hamiltonian to obtain

\[
\begin{aligned}
\mathcal{H}[M(t), P(t+1), G(t), H(t+1), t]
  ={}& [Q(t) + R(t)G^2(t)]\,M_{00}(t) - G^2(t)R(t)M_{10}(t) - G^2(t)R(t)M_{01}(t) \\
  & + G^2(t)R(t)M_{11}(t) + (M_{00}(t+1) - M_{00}(t))\,P_{00}(t+1) \\
  & + (M_{01}(t+1) - M_{01}(t))\,P_{01}(t+1) + (M_{10}(t+1) - M_{10}(t))\,P_{10}(t+1) \\
  & + (M_{11}(t+1) - M_{11}(t))\,P_{11}(t+1)
\end{aligned}
\tag{A.6}
\]

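From Eq. (A.3), the components of the co-state matrix P are
given by

\[
\begin{aligned}
P^*_{00}(t) ={}& Q(t) + G^2(t)R(t)
  + \bigl[(a(t) - b(t)G(t))^2 + \Sigma_{aa}(t) + \Sigma_{bb}(t)G^2(t)\bigr]\,P^*_{00}(t+1) \\
  & + [1 - H(t+1)c(t+1)]\,\bigl[\Sigma_{aa}(t) + \Sigma_{bb}(t)G^2(t)\bigr]\,
    \bigl[P^*_{01}(t+1) + P^*_{10}(t+1)\bigr] \\
  & + \bigl[(1 - H(t+1)c(t+1))^2(\Sigma_{aa}(t) + \Sigma_{bb}(t)G^2(t)) \\
  & \quad + \Sigma_{cc}(t+1)H^2(t+1)\bigl((a(t) - b(t)G(t))^2 + \Sigma_{aa}(t)
    + \Sigma_{bb}(t)G^2(t)\bigr)\bigr]\,P^*_{11}(t+1)
\end{aligned}
\tag{A.7}
\]

\[
\begin{aligned}
P^*_{01}(t) ={}& -G^2(t)R(t)
  + P^*_{00}(t+1)\bigl[b(t)G(t)(a(t) - b(t)G(t)) - \Sigma_{bb}(t)G^2(t)\bigr] \\
  & + \bigl[(1 - H(t+1)c(t+1))(a^2(t) - \Sigma_{bb}(t)G^2(t) - a(t)b(t)G(t))\bigr]\,P^*_{01}(t+1) \\
  & + P^*_{11}(t+1)\bigl[-(1 - H(t+1)c(t+1))^2\,\Sigma_{bb}(t)G^2(t) \\
  & \quad + \Sigma_{cc}(t+1)H^2(t+1)\bigl(a(t)b(t) - (b^2(t) + \Sigma_{bb}(t))G(t)\bigr)G(t)\bigr]
\end{aligned}
\tag{A.8}
\]

\[
\begin{aligned}
P^*_{11}(t) ={}& G^2(t)R(t) + \bigl[(\Sigma_{bb}(t) + b^2(t))G^2(t)\bigr]\,P^*_{00}(t+1) \\
  & + \bigl[(1 - H(t+1)c(t+1))(a(t)b(t)G(t) + \Sigma_{bb}(t)G^2(t))\bigr]\,
    \bigl(P^*_{10}(t+1) + P^*_{01}(t+1)\bigr) \\
  & + P^*_{11}(t+1)\bigl[(1 - H(t+1)c(t+1))^2(a^2(t) + \Sigma_{bb}(t)G^2(t)) \\
  & \quad + \Sigma_{cc}(t+1)H^2(t+1)(b^2(t) + \Sigma_{bb}(t))G^2(t)\bigr]
\end{aligned}
\tag{A.9}
\]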


For every G(t) and H(t), t = 0,1,2,...,N-1,

\[
\mathcal{H}[M^*(t), P^*(t+1), G^*(t), H^*(t+1)]
  \le \mathcal{H}[M^*(t), P^*(t+1), G(t), H(t+1)]
\tag{A.10}
\]

Since the "controls" G(t) and H(t) are unconstrained in this
problem, the necessary conditions for the minimization of the
Hamiltonian function are

\[
\left.\frac{\partial \mathcal{H}}{\partial G(t)}\right|_{*}
  = \left.\frac{\partial \mathcal{H}}{\partial H(t+1)}\right|_{*} = 0
\tag{A.11}
\]

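We obtain from the necessary condition
$\partial \mathcal{H}/\partial G(t) = 0$ that

\[
\begin{aligned}
0 ={}& b(t)\bigl(\Sigma_{cc}(t+1)H^2(t+1)P_{11}(t+1) + P_{00}(t+1)\bigr)\,a(t)\,[M_{00}(t) - M_{01}(t)] \\
  & + b(t)P_{01}(t+1)\bigl(1 - H(t+1)c(t+1)\bigr)\,a(t)\,[M_{01}(t) - M_{11}(t)] \\
  & - \Bigl\{ b^2(t)\bigl(\Sigma_{cc}(t+1)H^2(t+1)P_{11}(t+1) + P_{00}(t+1)\bigr) + R(t) \\
  & \quad + \Sigma_{bb}(t)\bigl(\Sigma_{cc}(t+1)H^2(t+1)P_{11}(t+1) + P_{00}(t+1)
      + [1 - H(t+1)c(t+1)]^2 P_{11}(t+1) \\
  & \quad + [1 - H(t+1)c(t+1)]\,P_{01}(t+1)\bigr) \Bigr\}\,
    G(t)\,[M_{00}(t) - 2M_{01}(t) + M_{11}(t)]
\end{aligned}
\tag{A.12}
\]

and from $\partial \mathcal{H}/\partial H(t+1) = 0$ that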


0 = P ( t + 1 ) 
l lv 


K 


!l(t+l)c2(t+l)(a2(t)Mu(t)  + 2(t)  + l!bb(t)G2(t) 
f M 0 o ( t )-2M  ( l ) f M ( t ) ) f E (t)M  (t)J  + ll(t  + l)E  (t  + 1) 

uu  oi  ii  an  oo  ' l’C 

(< 

^ M ( t. ) - 2M 


o u v ' ' —or 

1„0U)  (set)  - i.COGCOp*  Mo(l)E!ia<l>+  I„b(t)G2(t) 

00.  , „,(t)  + Mn(o)  - 2M01(t)(-  a(  t >b(  t )G(  t ) 

+ b2 ( t )G2 ( t II, 


bb 


Mn(t)b2(t)G2(  t ) 

) * 

H( t+ 1 ) 0( t + 1 ) - 

■ j l ( t ) + = ( t ) + 

£„ 

u<t>Moo<t)  + 

; ( t ) -2M 

, , ( t)  + 

M 

(1)^1  - 

0 0 

0 1 

l 

1 0 J 

^ b(t)G(t)Mi) 

(t) 

+ (a(t)-b(t)G(t 

¥ 


.nl(t))a(t)  + Saa(t)M00(t)  + 5'bb( 1 )G  (t ) 


>] 


(Moo(t)  - 2Moi(t)  f Mll(t))  + 5(t) 

t = 0,1 N-l 

(A.13)

If we assume that the orthogonality condition holds,

\[
E\{(x(t) - \hat{x}(t))\,\hat{x}(t)\} = 0
\tag{A.14}
\]

so that $M_{01}(t) = M_{11}(t)$, and assume $P_{01}(t+1) = 0$, then
Eq. (A.12) is satisfied if the optimal control gain is given by

\[
\begin{aligned}
G(t) ={}& \bigl[b(t)\bigl(\Sigma_{cc}(t+1)H^2(t+1)P_{11}(t+1) + P_{00}(t+1)\bigr)a(t)\bigr] \\
  & \cdot \bigl[(b^2(t) + \Sigma_{bb}(t))\bigl(\Sigma_{cc}(t+1)H^2(t+1)P_{11}(t+1) + P_{00}(t+1)\bigr) + R(t) \\
  & \quad + \Sigma_{bb}(t)\bigl(1 - H(t+1)c(t+1)\bigr)^2 P_{11}(t+1)\bigr]^{-1}
\end{aligned}
\tag{A.15}
\]

and we conclude immediately from Eq. (A.13) that

\[
H(t+1) = \bar{M}_{11}(t+1)\,c(t+1)\,
  \bigl[c^2(t+1)\bar{M}_{11}(t+1) + \Sigma_{cc}(t+1)M_{00}(t+1) + \Theta(t+1)\bigr]^{-1}
\tag{A.16}
\]

where

\[
\bar{M}_{11}(t+1) = a^2(t)M_{11}(t) + \Xi(t) + \Sigma_{bb}(t)G^2(t)\,[M_{00}(t) - M_{11}(t)]
\tag{A.17}
\]

Substituting the optimum gain Eq. (A.15) into (A.8), we obtain
that $P_{01}(t) = 0$. Since we choose $P_{01}(N) = 0$, we conclude
that $P_{01}(t) = P_{10}(t) = 0$ for all $t \in [0,N]$.

The difference equation for the correlation between the state
estimate and the estimation error is given by

\[
E\{(x(t) - \hat{x}(t))\,\hat{x}(t)\} = M_{01}(t) - M_{11}(t)
\tag{A.18}
\]

It can be shown that if $M_{01}(t) = M_{11}(t)$ and the filter gain
is given by (A.16), then $M_{01}(t+1) = M_{11}(t+1)$. Since, by
choice of initial condition, Eq. (4.4.19), $M_{11}(0) = M_{01}(0)$,
we conclude that $M_{11}(t) = M_{01}(t)$ for all $t \in [0,N]$.


The filter and control gains for the deterministic optimization
problem are given by Eqs. (A.15) and (A.16). The co-states
propagate backwards according to

\[
\begin{aligned}
P_{00}(t) ={}& (a(t) - b(t)G(t))^2\bigl(\Sigma_{cc}(t+1)H^2(t+1)P_{11}(t+1) + P_{00}(t+1)\bigr) + Q(t) \\
  & + \Sigma_{aa}(t)\bigl(\Sigma_{cc}(t+1)H^2(t+1)P_{11}(t+1) + P_{00}(t+1)
      + (1 - H(t+1)c(t+1))^2 P_{11}(t+1)\bigr) \\
  & + G^2(t)\bigl[R(t) + \Sigma_{bb}(t)\bigl(\Sigma_{cc}(t+1)H^2(t+1)P_{11}(t+1) + P_{00}(t+1) \\
  & \quad + (1 - H(t+1)c(t+1))^2 P_{11}(t+1)\bigr)\bigr],
  \qquad P_{00}(N) = F
\end{aligned}
\tag{A.19}
\]

\[
\begin{aligned}
P_{11}(t) ={}& a^2(t)\bigl(1 - H(t+1)c(t+1)\bigr)^2 P_{11}(t+1) \\
  & + G^2(t)\bigl[(b^2(t) + \Sigma_{bb}(t))\bigl(\Sigma_{cc}(t+1)H^2(t+1)P_{11}(t+1) + P_{00}(t+1)\bigr) + R(t) \\
  & \quad + \Sigma_{bb}(t)\bigl(1 - H(t+1)c(t+1)\bigr)^2 P_{11}(t+1)\bigr],
  \qquad P_{11}(N) = 0
\end{aligned}
\tag{A.20}
\]

Note  that  the  co-state  equations  are  coupled  nonlinear 
difference  equations. 

The closed-loop system transition parameter is given by

\[
\begin{aligned}
\phi(t) = a(t)\Bigl\{ 1 - {}& b^2(t)\bigl(\Sigma_{cc}(t+1)H^2(t+1)P_{11}(t+1) + P_{00}(t+1)\bigr) \\
  & \cdot \bigl[(b^2(t) + \Sigma_{bb}(t))\bigl(\Sigma_{cc}(t+1)H^2(t+1)P_{11}(t+1) + P_{00}(t+1)\bigr) \\
  & \quad + R(t) + \Sigma_{bb}(t)\bigl(1 - H(t+1)c(t+1)\bigr)^2 P_{11}(t+1)\bigr]^{-1} \Bigr\}
\end{aligned}
\tag{A.21}
\]

APPENDIX B


DERIVATION  OF  THE  OPTIMAL  LINEAR  CONTROL 
USING  DYNAMIC  PROGRAMMING 


The optimal control problem stated in Eqs. (4.5.34) to (4.5.36)
is solved in this appendix using the dynamic programming method.
At time t = T-1, the expected cost-to-go is given by

\[
J(T-1, x(T-1)) = E\bigl\{ F x^2(T) + Q(T-1)x^2(T-1) + R(T-1)u^2(T-1) \bigm| z^{T-1}, U^{T-2} \bigr\}
\tag{B.1}
\]


Imposing the constraint that the control u(t) be given by a
linear time-varying feedback law of the form

\[
u(t) = -G(t)\,\hat{x}(t)
\tag{B.2}
\]

we get

\[
\begin{aligned}
J(x(T-1), T-1) ={}& \min_{G(T-1),\,H(T-1)} E\bigl\{ Q(T-1)x^2(T-1) + R(T-1)G^2(T-1)\hat{x}^2(T-1) \\
  & \quad + F\bigl[a^2(T-1)x^2(T-1) + b^2(T-1)G^2(T-1)\hat{x}^2(T-1) \\
  & \quad - 2a(T-1)b(T-1)x(T-1)G(T-1)\hat{x}(T-1) + \xi^2(T-1)\bigr] \bigm| z^{T-1} \bigr\} \\
  ={}& E\bigl\{ [Q(T-1) + F a^2(T-1)]\,x^2(T-1) \bigm| z^{T-1} \bigr\} \\
  & + \min_{G(T-1),\,H(T-1)} E\bigl\{ (R(T-1) + F b^2(T-1))\,G^2(T-1)\hat{x}^2(T-1) \\
  & \quad - 2F a(T-1)b(T-1)G(T-1)\,x(T-1)\hat{x}(T-1) \bigm| z^{T-1} \bigr\} + F\,\Xi(T-1)
\end{aligned}
\tag{B.3}
\]

To compute the cost-to-go expression we need to evaluate
$E\{x(t)\hat{x}(t) \mid z^t\}$ and $E\{\hat{x}^2(t) \mid z^t\}$ and
obtain their dynamical equations.

The state of the constrained linear controller-estimator system
is given by

\[
x(t+1) = a(t)x(t) - b(t)G(t)\hat{x}(t) + \xi(t)
\tag{B.4}
\]

Denote the predicted estimate by $\bar{x}(\cdot)$; then

\[
\hat{x}(t+1) = (1 - H(t+1)c(t+1))\,\bar{x}(t+1) + H(t+1)z(t+1)
\tag{B.5}
\]

The estimate $\hat{x}(t+1)$ is a random process since it depends
on z(t+1), the measurement process.

The predicted state estimate is given by

\[
\bar{x}(t+1) = (a(t) - b(t)G(t))\,\hat{x}(t)
\tag{B.6}
\]

Using Eq. (B.4) we calculate the quantities

\[
\begin{aligned}
E\{x(t)\hat{x}(t) \mid z^t\}
  ={}& E\bigl\{ [(1 - H(t)c(t))\,\bar{x}(t) + H(t)z(t)]\,x(t) \bigm| z^t \bigr\} \\
  ={}& E\{\bar{x}(t)e(t)\} + H(t)c(t)\,\bar{\Sigma}_{xx}(t|t-1) + E\{\bar{x}^2(t)\} \\
  & + H(t)c(t)\,E\{e(t)\bar{x}(t)\}
\end{aligned}
\tag{B.7}
\]

where $e(t) = x(t) - \bar{x}(t)$.

From the derivation of the optimal solution to the deterministic
problem, in which the control is constrained to be a linear
mapping of the outputs of a linear filter driven by the
measurements, it was shown that, with the assumption

\[
E\{e(0)\hat{x}(0)\} = 0, \qquad e(0) = x(0) - \hat{x}(0)
\tag{B.8}
\]

the filter and control gains given by Eq. (4.5.1) and Eq. (4.5.2)
jointly satisfy the necessary conditions for optimality. In
addition, the orthogonality condition
$E\{\hat{x}(t)e(t|t)\} = 0$ holds for all $t \in [0,N]$, so that the
estimate and the error are uncorrelated.

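Then we have

\[
E\{x(t)\hat{x}(t) \mid z^t\} = \bar{X}(t) + H(t)c(t)\,\bar{\Sigma}_{xx}(t|t-1)
\tag{B.9}
\]

where

\[
\bar{X}(t) \triangleq E\{\bar{x}^2(t)\}
\tag{B.10}
\]

\[
\begin{aligned}
E\{\hat{x}^2(t) \mid z^t\}
  ={}& E\bigl\{ (1 - H(t)c(t))^2\bar{x}^2(t) + H^2(t)z^2(t) \\
  & \quad + 2(1 - H(t)c(t))\,\bar{x}(t)H(t)z(t) \bigm| z^t \bigr\} \\
  ={}& \bar{X}(t) + H^2(t)\bigl(\Theta(t) + c^2(t)\bar{\Sigma}_{xx}(t) + \Sigma_{cc}(t)X(t)\bigr)
  = \hat{X}(t)
\end{aligned}
\tag{B.11}
\]

where the orthogonality condition was used, and we define

\[
X(t) \triangleq E\{x^2(t)\} = \bar{\Sigma}_{xx}(t) + \bar{X}(t)
\tag{B.12}
\]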

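Substituting these results for
$E\{x(T-1)\hat{x}(T-1) \mid z^{T-1}\}$ and
$E\{\hat{x}^2(T-1) \mid z^{T-1}\}$ into the cost-to-go, we obtain

\[
\begin{aligned}
J(T-1) ={}& E\bigl\{ [Q(T-1) + F a^2(T-1)]\,x^2(T-1) \bigm| z^{T-1} \bigr\} \\
  & + E\bigl\{ R(T-1) + F b^2(T-1) \bigr\}\,G^2(T-1)\bigl(\bar{X}(T-1) \\
  & \quad + H^2(T-1)(c^2(T-1)\bar{\Sigma}_{xx}(T-1) + \Theta(T-1)) \\
  & \quad + H^2(T-1)\Sigma_{cc}(T-1)X(T-1)\bigr) \\
  & - 2F a(T-1)b(T-1)G(T-1)\bigl(\bar{X}(T-1) + H(T-1)c(T-1)\bar{\Sigma}_{xx}(T-1)\bigr) \\
  & + F\,\Xi(T-1)
\end{aligned}
\tag{B.13}
\]

We want to minimize the cost-to-go with respect to G(T-1) and
H(T-1). The necessary conditions for optimality are obtained by
setting the partial derivatives $\partial J/\partial G(T-1)$ and
$\partial J/\partial H(T-1)$ to zero, respectively. Therefore,

\[
\begin{aligned}
\frac{\partial J}{\partial G(T-1)} = 0
  ={}& [R(T-1) + F(b^2(T-1) + \Sigma_{bb}(T-1))]\,G(T-1) \\
  & \cdot \bigl(\bar{X}(T-1) + H^2(T-1)(c^2(T-1)\bar{\Sigma}_{xx}(T-1) + \Theta(T-1) + \Sigma_{cc}(T-1)X(T-1))\bigr) \\
  & - F a(T-1)b(T-1)\bigl(\bar{X}(T-1) + H(T-1)c(T-1)\bar{\Sigma}_{xx}(T-1)\bigr)
\end{aligned}
\tag{B.14}
\]

\[
\begin{aligned}
\frac{\partial J}{\partial H(T-1)} = 0
  ={}& [R(T-1) + F(b^2(T-1) + \Sigma_{bb}(T-1))]\,G^2(T-1) \\
  & \cdot H(T-1)\bigl(c^2(T-1)\bar{\Sigma}_{xx}(T-1) + \Theta(T-1) + \Sigma_{cc}(T-1)X(T-1)\bigr) \\
  & - F a(T-1)b(T-1)G(T-1)c(T-1)\bar{\Sigma}_{xx}(T-1)
\end{aligned}
\tag{B.15}
\]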

Multiplying Eq. (B.14) by G(T-1) and Eq. (B.15) by H(T-1), and
then subtracting Eq. (B.15) from Eq. (B.14), we obtain

\[
G^*(T-1) = \frac{F a(T-1)b(T-1)}{R(T-1) + F(b^2(T-1) + \Sigma_{bb}(T-1))}
\tag{B.16}
\]
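
The last-stage gain of Eq. (B.16) can be checked numerically with a minimal sketch. To keep it short, the sketch assumes the state is known exactly (so the estimate equals the state) and drops the additive noise, which does not affect the minimizing gain. The function names and the numerical values are illustrative assumptions. The cost coefficient is the expectation of F x(T)^2 + R u^2 per unit x^2 when u = -g x and a, b are uncorrelated white parameters.

```python
def last_stage_gain(a, b, var_b, f, r):
    """Gain in the form of Eq. (B.16): F a b / (R + F (b^2 + Sigma_bb))."""
    return f * a * b / (r + f * (b * b + var_b))


def last_stage_cost(g, a, b, var_a, var_b, f, r):
    """Coefficient of x^2 in E{F x(T)^2 + R u^2} for u = -g x (state assumed known)."""
    return f * (a * a + var_a - 2.0 * a * b * g + (b * b + var_b) * g * g) + r * g * g


def grid_minimizer(a, b, var_a, var_b, f, r, points=2001, g_max=2.0):
    """Brute-force search for the gain that minimizes last_stage_cost on a uniform grid."""
    grid = [g_max * i / (points - 1) for i in range(points)]
    return min(grid, key=lambda g: last_stage_cost(g, a, b, var_a, var_b, f, r))


if __name__ == "__main__":
    # Hypothetical values; the gain shrinks as the variance of b grows.
    for var_b in (0.0, 0.5, 2.0):
        print(var_b, last_stage_gain(1.2, 1.0, var_b, 2.0, 1.0),
              grid_minimizer(1.2, 1.0, 0.3, var_b, 2.0, 1.0))
```

Setting var_b to zero recovers the certainty-equivalent gain, so the comparison shows directly how the a priori uncertainty in b reduces the gain.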


Substituting this optimal control in Eq. (4.4.75), we then get

\[
H^*(T-1) = \frac{\bar{\Sigma}_{xx}(T-1)\,c(T-1)}
  {c^2(T-1)\bar{\Sigma}_{xx}(T-1) + \Theta(T-1) + \Sigma_{cc}(T-1)X(T-1)}
\tag{B.17}
\]

where

\[
\begin{aligned}
\bar{\Sigma}_{xx}(T-1) ={}& a^2(T-2)\Sigma_{xx}(T-2) + \Sigma_{aa}(T-2)X(T-2) \\
  & + \Sigma_{bb}(T-2)G^2(T-2)\hat{X}(T-2) + \Xi(T-2)
\end{aligned}
\tag{B.18}
\]

To evaluate the cost-to-go at t = T-1, we substitute the optimal
linear control gain $G^*(T-1)$ and filter gain $H^*(T-1)$ into the
cost functional Eq. (B.13) to get

\[
\begin{aligned}
J(x(T-1), T-1) ={}& [F(a^2(T-1) + \Sigma_{aa}(T-1)) + Q(T-1)]\,E\{x^2(T-1) \mid z^{T-1}\} \\
  & + \frac{[F b(T-1)a(T-1)]^2}{R(T-1) + F(b^2(T-1) + \Sigma_{bb}(T-1))}\,
    E\{\hat{x}^2(T-1) \mid z^{T-1}\} \\
  & - 2\,\frac{[F a(T-1)b(T-1)]^2}{R(T-1) + F(b^2(T-1) + \Sigma_{bb}(T-1))}\,
    E\{x(T-1)\hat{x}(T-1) \mid z^{T-1}\} \\
  & + F\,\Xi(T-1) \\
  ={}& E\{K(T-1)x^2(T-1) \mid z^{T-1}\} \\
  & + G^2(T-1)[R(T-1) + F(b^2(T-1) + \Sigma_{bb}(T-1))]\,\Sigma_{xx}(T-1) + F\,\Xi(T-1)
\end{aligned}
\tag{B.19}
\]

where we define the variable

\[
\begin{aligned}
K(T-1) ={}& F(a^2(T-1) + \Sigma_{aa}(T-1)) + Q(T-1) \\
  & - G^2(T-1)[R(T-1) + F(b^2(T-1) + \Sigma_{bb}(T-1))]
\end{aligned}
\tag{B.20}
\]

and

\[
k(T-1) = G^2(T-1)[R(T-1) + F(b^2(T-1) + \Sigma_{bb}(T-1))]
\tag{B.21}
\]

Now we proceed to find the optimal G(T-2) and H(T-2). Using the
Principle of Optimality, we have

\[
\begin{aligned}
J(x(T-2), T-2) = E\bigl\{ & K(T-1)x^2(T-1) + k(T-1)\Sigma_{xx}(T-1) + F\,\Xi(T-1) \\
  & + Q(T-2)x^2(T-2) + R(T-2)G^2(T-2)\hat{x}^2(T-2) \bigm| z^{T-2} \bigr\}
\end{aligned}
\tag{B.22}
\]

Since  the  covariance  of  the  estimation  error  is  not  indepen- 
dent of  the  past  controls,  we  have  to  include  it  in  the 
recurrence  functional  equation.  The  dependence  on  x(t)  and 
7}  is  evident  in  the  covariance  propagation  equation  for 
$\Sigma_{xx}(t)$, Eq. (B.14). The problem of estimation is no longer
separable from that of the control.

Hence the cost-to-go becomes

\[
\begin{aligned}
J(x(T-2), T-2) ={}& E\bigl\{ (a^2(T-2)K(T-1) + Q(T-2))\,x^2(T-2) \bigm| z^{T-2} \bigr\} \\
  & + E\bigl\{ [R(T-2) + K(T-1)b^2(T-2)]\,G^2(T-2)\hat{x}^2(T-2) \bigm| z^{T-2} \bigr\} \\
  & - 2K(T-1)\,E\{a(T-2)b(T-2)\}\,G(T-2)\,E\{x(T-2)\hat{x}(T-2) \mid z^{T-2}\} \\
  & + K(T-1)\Xi(T-2) + k(T-1)\Sigma_{xx}(T-1) + F\,\Xi(T-1)
\end{aligned}
\tag{B.23}
\]

We need to expand the expression for the error covariance
$\Sigma_{xx}(T-1|T-1)$ from Eq. (B.14),

\[
\begin{aligned}
\Sigma_{xx}(T-1) ={}& (1 - H(T-1)c(T-1))^2\,\bar{\Sigma}_{xx}(T-1) \\
  & + H^2(T-1)\,[\Sigma_{cc}(T-1)X(T-1) + \Theta(T-1)]
\end{aligned}
\tag{B.24}
\]

and the predicted covariance is given by

\[
\begin{aligned}
\bar{\Sigma}_{xx}(T-1) ={}& a^2(T-2)\Sigma_{xx}(T-2) + \Sigma_{aa}(T-2)X(T-2) \\
  & + \Sigma_{bb}(T-2)G^2(T-2)\,E\{\hat{x}^2(T-2) \mid z^{T-2}\} + \Xi(T-2)
\end{aligned}
\tag{B.25}
\]

The cost-to-go is, therefore,

\[
\begin{aligned}
J(T-2) ={}& \bigl[(a^2(T-2) + \Sigma_{aa}(T-2))K(T-1) + Q(T-2) \\
  & \quad + k(T-1)(1 - H(T-1)c(T-1))^2\,\Sigma_{aa}(T-2)\bigr]\,E\{x^2(T-2) \mid z^{T-2}\} \\
  & + \bigl[R(T-2) + K(T-1)(b^2(T-2) + \Sigma_{bb}(T-2)) \\
  & \quad + (1 - H(T-1)c(T-1))^2\,\Sigma_{bb}(T-2)k(T-1)\bigr]\,G^2(T-2)\,E\{\hat{x}^2(T-2) \mid z^{T-2}\} \\
  & - 2K(T-1)a(T-2)b(T-2)G(T-2)\,E\{x(T-2)\hat{x}(T-2) \mid z^{T-2}\} \\
  & + k(T-1)\,[a^2(T-2)\Sigma_{xx}(T-2) + \Xi(T-2)]\,(1 - H(T-1)c(T-1))^2 \\
  & + k(T-1)H^2(T-1)\,[\Sigma_{cc}(T-1)X(T-1) + \Theta(T-1)] \\
  & + K(T-1)\Xi(T-2) + F\,\Xi(T-1)
\end{aligned}
\tag{B.26}
\]

Substituting the expressions for $E\{x(T-2)\hat{x}(T-2)\}$,
$E\{\hat{x}^2(T-2)\}$, $\bar{\Sigma}_{xx}(T-2)$, and X(T-1), with

\[
\begin{aligned}
X(T-1) ={}& (a(T-2) - b(T-2)G(T-2))^2 X(T-2) \\
  & + 2b(T-2)G(T-2)(a(T-2) - b(T-2)G(T-2))\,S(T-2) \\
  & + \Xi(T-2) + b^2(T-2)G^2(T-2)\Sigma_{xx}(T-2) \\
  & + \Sigma_{aa}(T-2)X(T-2) + \Sigma_{bb}(T-2)\hat{X}(T-2)G^2(T-2)
\end{aligned}
\tag{B.27}
\]

where we identify $S(t) = M_{01}(t)$, Eq. (4.4.18), yields

\[
\begin{aligned}
J(T-2) ={}& \bigl[(a^2(T-2) + \Sigma_{aa}(T-2))K(T-1) + Q(T-2) \\
  & \quad + k(T-1)(1 - H(T-1)c(T-1))^2\,\Sigma_{aa}(T-2) \\
  & \quad + k(T-1)H^2(T-1)\Sigma_{cc}(T-1)(a^2(T-2) + \Sigma_{aa}(T-2))\bigr]\,E\{x^2(T-2) \mid z^{T-2}\} \\
  & + \bigl[R(T-2) + K(T-1)(b^2(T-2) + \Sigma_{bb}(T-2)) \\
  & \quad + \Sigma_{bb}(T-2)(1 - H(T-1)c(T-1))^2 k(T-1) \\
  & \quad + k(T-1)H^2(T-1)\Sigma_{cc}(T-1)(b^2(T-2) + \Sigma_{bb}(T-2))\bigr]\,G^2(T-2) \\
  & \quad \cdot \bigl[\bar{X}(T-2) + H^2(T-2)(c^2(T-2)\bar{\Sigma}_{xx}(T-2) + \Theta(T-2) + \Sigma_{cc}(T-2)X(T-2))\bigr] \\
  & - \bigl[2K(T-1)a(T-2)b(T-2)G(T-2) \\
  & \quad + 2a(T-2)b(T-2)G(T-2)k(T-1)H^2(T-1)\Sigma_{cc}(T-1)\bigr] \\
  & \quad \cdot \bigl[\bar{X}(T-2) + H(T-2)c(T-2)\bar{\Sigma}_{xx}(T-2)\bigr] \\
  & + k(T-1)(1 - H(T-1)c(T-1))^2\,[a^2(T-2)\Sigma_{xx}(T-2) + \Xi(T-2)] \\
  & + k(T-1)H^2(T-1)\Theta(T-1) + k(T-1)H^2(T-1)\Sigma_{cc}(T-1)\Xi(T-2) \\
  & + K(T-1)\Xi(T-2) + F\,\Xi(T-1)
\end{aligned}
\tag{B.28}
\]


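Carrying out the algebraic minimization, we get

\[
\begin{aligned}
\frac{\partial J}{\partial G(T-2)} = 0
  ={}& \bigl[R(T-2) + K(T-1)(b^2(T-2) + \Sigma_{bb}(T-2)) \\
  & \quad + \Sigma_{bb}(T-2)(1 - H(T-1)c(T-1))^2 k(T-1) \\
  & \quad + (b^2(T-2) + \Sigma_{bb}(T-2))H^2(T-1)\Sigma_{cc}(T-1)k(T-1)\bigr]\,G(T-2) \\
  & \quad \cdot \bigl[\bar{X}(T-2) + H^2(T-2)(c^2(T-2)\bar{\Sigma}_{xx}(T-2) + \Theta(T-2) + \Sigma_{cc}(T-2)X(T-2))\bigr] \\
  & - \bigl[a(T-2)b(T-2)k(T-1)H^2(T-1)\Sigma_{cc}(T-1) + K(T-1)a(T-2)b(T-2)\bigr] \\
  & \quad \cdot \bigl[\bar{X}(T-2) + H(T-2)c(T-2)\bar{\Sigma}_{xx}(T-2)\bigr]
\end{aligned}
\tag{B.29}
\]

and

\[
\begin{aligned}
\frac{\partial J}{\partial H(T-2)} = 0
  ={}& \bigl[R(T-2) + K(T-1)(b^2(T-2) + \Sigma_{bb}(T-2)) \\
  & \quad + \Sigma_{bb}(T-2)(1 - H(T-1)c(T-1))^2 k(T-1) \\
  & \quad + k(T-1)(b^2(T-2) + \Sigma_{bb}(T-2))H^2(T-1)\Sigma_{cc}(T-1)\bigr] \\
  & \quad \cdot G^2(T-2)H(T-2)\bigl(c^2(T-2)\bar{\Sigma}_{xx}(T-2) + \Theta(T-2) + \Sigma_{cc}(T-2)X(T-2)\bigr) \\
  & - \bigl[K(T-1) + H^2(T-1)\Sigma_{cc}(T-1)k(T-1)\bigr] \\
  & \quad \cdot a(T-2)b(T-2)G(T-2)c(T-2)\bar{\Sigma}_{xx}(T-2)
\end{aligned}
\tag{B.30}
\]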

Multiplying the first equation by G(T-2) and the second equation
by H(T-2), we get that the solution is given by

\[
G^*(T-2) = \frac{b(T-2)a(T-2)\bigl(K(T-1) + H^2(T-1)\Sigma_{cc}(T-1)k(T-1)\bigr)}
  {\begin{aligned}
   & R(T-2) + (b^2(T-2) + \Sigma_{bb}(T-2))\bigl(K(T-1) + H^2(T-1)\Sigma_{cc}(T-1)k(T-1)\bigr) \\
   & \quad + \Sigma_{bb}(T-2)(1 - H(T-1)c(T-1))^2 k(T-1)
   \end{aligned}}
\tag{B.31}
\]

\[
H^*(T-2) = \frac{c(T-2)\bar{\Sigma}_{xx}(T-2)}
  {c^2(T-2)\bar{\Sigma}_{xx}(T-2) + \Theta(T-2) + \Sigma_{cc}(T-2)X(T-2)}
\tag{B.32}
\]


The cost-to-go when evaluated at t = T-2 is given by

\[
\begin{aligned}
J(T-2) ={}& \bigl[(a^2(T-2) + \Sigma_{aa}(T-2))\bigl(K(T-1) + k(T-1)H^2(T-1)\Sigma_{cc}(T-1)\bigr) \\
  & \quad + Q(T-2) + k(T-1)(1 - H(T-1)c(T-1))^2\,\Sigma_{aa}(T-2) \\
  & \quad - G^2(T-2)\bigl(R(T-2) + (b^2(T-2) + \Sigma_{bb}(T-2)) \\
  & \qquad \cdot \bigl(K(T-1) + k(T-1)H^2(T-1)\Sigma_{cc}(T-1)\bigr) \\
  & \qquad + \Sigma_{bb}(T-2)(1 - H(T-1)c(T-1))^2 k(T-1)\bigr)\bigr]\,X(T-2) \\
  & + G^2(T-2)\bigl[R(T-2) + (b^2(T-2) + \Sigma_{bb}(T-2))\bigl(K(T-1) + k(T-1)H^2(T-1)\Sigma_{cc}(T-1)\bigr) \\
  & \quad + \Sigma_{bb}(T-2)(1 - H(T-1)c(T-1))^2 k(T-1)\bigr]\,\Sigma_{xx}(T-2) \\
  & + k(T-1)(1 - H(T-1)c(T-1))^2 a^2(T-2)\Sigma_{xx}(T-2) \\
  & + k(T-1)\bigl[(1 - H(T-1)c(T-1))^2\,\Xi(T-2) + \Sigma_{cc}(T-1)H^2(T-1)\Xi(T-2) \\
  & \quad + H^2(T-1)\Theta(T-1)\bigr] + K(T-1)\Xi(T-2) + F\,\Xi(T-1) \\
  ={}& E\{K(T-2)x^2(T-2) \mid z^{T-2}\} + k(T-2)\Sigma_{xx}(T-2) \\
  & + \sum_{t=T-2}^{T-1} \Bigl\{ K(t+1)\Xi(t) + k(t+1)\bigl[(1 - H(t+1)c(t+1))^2\,\Xi(t) \\
  & \quad + H^2(t+1)\Sigma_{cc}(t+1)\Xi(t) + H^2(t+1)\Theta(t+1)\bigr] \Bigr\}
\end{aligned}
\tag{B.33}
\]

where we define

\[
\begin{aligned}
K(T-2) ={}& (a^2(T-2) + \Sigma_{aa}(T-2))\bigl(K(T-1) + \Sigma_{cc}(T-1)H^2(T-1)k(T-1)\bigr) \\
  & + Q(T-2) + \Sigma_{aa}(T-2)(1 - H(T-1)c(T-1))^2 k(T-1) \\
  & - G^2(T-2)\bigl[R(T-2) + (b^2(T-2) + \Sigma_{bb}(T-2))\bigl(K(T-1) \\
  & \quad + H^2(T-1)\Sigma_{cc}(T-1)k(T-1)\bigr) \\
  & \quad + \Sigma_{bb}(T-2)(1 - H(T-1)c(T-1))^2 k(T-1)\bigr]
\end{aligned}
\tag{B.34}
\]

\[
\begin{aligned}
k(T-2) ={}& a^2(T-2)(1 - H(T-1)c(T-1))^2 k(T-1) \\
  & + G^2(T-2)\bigl[R(T-2) + (b^2(T-2) + \Sigma_{bb}(T-2))\bigl(K(T-1) \\
  & \quad + H^2(T-1)\Sigma_{cc}(T-1)k(T-1)\bigr) \\
  & \quad + \Sigma_{bb}(T-2)(1 - H(T-1)c(T-1))^2 k(T-1)\bigr]
\end{aligned}
\tag{B.35}
\]

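Using the Principle of Optimality, we have that

\[
\begin{aligned}
J(x(T-3), T-3) ={}& \min_{G(T-3),\,H(T-3)} E\bigl\{ J(T-2) + Q(T-3)x^2(T-3) \\
  & \quad + R(T-3)G^2(T-3)\hat{x}^2(T-3) \bigm| z^{T-3} \bigr\} \\
  ={}& \min_{G(T-3),\,H(T-3)} E_{a(\cdot),\,b(\cdot),\,c(\cdot),\,\xi(\cdot),\,\theta(\cdot)}
    \bigl\{ K(T-2)x^2(T-2) \\
  & \quad + k(T-2)\Sigma_{xx}(T-2|T-2) + Q(T-3)x^2(T-3) \\
  & \quad + R(T-3)G^2(T-3)\hat{x}^2(T-3) \bigm| z^{T-3} \bigr\}
\end{aligned}
\tag{B.36}
\]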

This is identical in form to the cost-to-go expression in
Eq. (B.22) except for the indices. By induction on t, we obtain
the solution to the optimum constrained linear
estimator-controller system problem,

\[
u(t) = -G(t)\,\hat{x}(t)
\tag{B.37}
\]

where

\[
\begin{aligned}
G(t) ={}& \bigl[R(t) + (b^2(t) + \Sigma_{bb}(t))\bigl(K(t+1) + H^2(t+1)\Sigma_{cc}(t+1)k(t+1)\bigr) \\
  & \quad + \Sigma_{bb}(t)(1 - H(t+1)c(t+1))^2 k(t+1)\bigr]^{-1} \\
  & \cdot b(t)a(t)\bigl[K(t+1) + H^2(t+1)\Sigma_{cc}(t+1)k(t+1)\bigr]
\end{aligned}
\tag{B.38}
\]

\[
\begin{aligned}
K(t) ={}& (a^2(t) + \Sigma_{aa}(t))\bigl(K(t+1) + H^2(t+1)\Sigma_{cc}(t+1)k(t+1)\bigr) + Q(t) \\
  & + \Sigma_{aa}(t)(1 - H(t+1)c(t+1))^2 k(t+1) \\
  & - \frac{b^2(t)a^2(t)\bigl(K(t+1) + H^2(t+1)\Sigma_{cc}(t+1)k(t+1)\bigr)^2}
    {\begin{aligned}
     & R(t) + (\Sigma_{bb}(t) + b^2(t))\bigl(K(t+1) + H^2(t+1)\Sigma_{cc}(t+1)k(t+1)\bigr) \\
     & \quad + \Sigma_{bb}(t)(1 - H(t+1)c(t+1))^2 k(t+1)
     \end{aligned}}
\end{aligned}
\tag{B.39}
\]

\[
\begin{aligned}
k(t) ={}& a^2(t)(1 - H(t+1)c(t+1))^2 k(t+1) \\
  & + \frac{b^2(t)a^2(t)\bigl(K(t+1) + H^2(t+1)\Sigma_{cc}(t+1)k(t+1)\bigr)^2}
    {\begin{aligned}
     & R(t) + (\Sigma_{bb}(t) + b^2(t))\bigl(K(t+1) + H^2(t+1)\Sigma_{cc}(t+1)k(t+1)\bigr) \\
     & \quad + \Sigma_{bb}(t)(1 - H(t+1)c(t+1))^2 k(t+1)
     \end{aligned}}
\end{aligned}
\tag{B.40}
\]

\[
H(t) = \frac{c(t)\bar{\Sigma}_{xx}(t)}
  {c^2(t)\bar{\Sigma}_{xx}(t) + \Theta(t) + \Sigma_{cc}(t)X(t)}
\tag{B.41}
\]

and

\[
\hat{x}(t) = (1 - H(t)c(t))\,(a(t-1) - b(t-1)G(t-1))\,\hat{x}(t-1) + H(t)z(t)
\tag{B.42}
\]

where $\bar{\Sigma}_{xx}(t)$ is given by Eq. (4.4.15) and
Eqs. (4.4.12) to (4.4.15) if we identify
$\bar{\Sigma}_{xx}(t) = \bar{M}_{11}(t)$ and
$\Sigma_{xx}(t) = M_{11}(t)$, and $X(t) = M_{00}(t)$.
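
The structure of the two gains can be made concrete with a minimal sketch that evaluates expressions of the form of Eqs. (B.38) and (B.41) for given numerical inputs. It does not solve the coupled two-point boundary value problem; the quantities at t+1 (K, k, H, c and the variance of c) and the predicted covariance are simply passed in as hypothetical values. The function and argument names are illustrative assumptions.

```python
def control_gain(a, b, var_b, r, K_next, k_next, H_next, c_next, var_c_next):
    """Control gain G(t) in the form of Eq. (B.38); *_next arguments refer to time t+1."""
    weight = K_next + H_next ** 2 * var_c_next * k_next
    denom = r + (b * b + var_b) * weight + var_b * (1.0 - H_next * c_next) ** 2 * k_next
    return b * a * weight / denom


def filter_gain(c, var_c, theta, sigma_pred, x_second_moment):
    """Filter gain H(t) in the form of Eq. (B.41).

    sigma_pred is the predicted error variance and x_second_moment is E{x^2(t)}.
    """
    return c * sigma_pred / (c * c * sigma_pred + theta + var_c * x_second_moment)


if __name__ == "__main__":
    # With zero parameter variances the expressions reduce to the familiar
    # linear-quadratic control gain and Kalman filter gain.
    print(control_gain(1.0, 1.0, 0.0, 1.0, 1.0, 0.7, 0.5, 1.0, 0.0))
    print(filter_gain(1.0, 0.0, 1.0, 1.0, 3.0))
    # Nonzero variances of b and c reduce both gains.
    print(control_gain(1.0, 1.0, 0.4, 1.0, 1.0, 0.7, 0.5, 1.0, 0.0))
    print(filter_gain(1.0, 0.2, 1.0, 1.0, 3.0))
```

The control gain depends on the filter gain at t+1 and the filter gain depends on the second moment of the state, which is how the estimation and control problems become coupled.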